Foreword


Those of us involved in the creation of the Handbook of Artificial Intelligence, both
writers and editors, have attempted to make the concepts, methods, tools, and main results
of artificial intelligence research accessible to a broad scientific and engineering audience.
Currently, AI work is familiar mainly to its practicing specialists and other interested
computer scientists. Yet the field is of growing interdisciplinary interest and practical
importance. With this book we are trying to build bridges that are easily crossed by
engineers, scientists in other fields, and our own computer science colleagues.

In the Handbook we intend to cover the breadth and depth of AI, presenting general
overviews of the scientific issues, as well as detailed discussions of particular techniques
and important AI systems. Throughout we have tried to keep in mind the reader who is not a
specialist in AI.

As the cost of computation continues to fall, new areas of computer applications
become potentially viable. For many of these areas, there do not exist mathematical "cores"
to structure calculational use of the computer. Such areas will inevitably be served by
symbolic models and symbolic inference techniques. Yet those who understand symbolic
computation have been speaking largely to themselves for twenty years. We feel that it is
urgent for AI to "go public" in the manner intended by the Handbook.

Several other writers have recognized a need for more widespread knowledge of AI
and have attempted to help fill the vacuum. Lay reviews, in particular Margaret Boden's
Artificial Intelligence and Natural Man, have tried to explain what is important and
interesting about AI, and how research in AI progresses through our programs. In addition,
there are a few textbooks that attempt to present a more detailed view of selected areas
of AI, for the serious student of computer science. But no textbook can hope to describe all
of the sub-areas, to present brief explanations of the important ideas and techniques, and to
review the forty or fifty most important AI systems.

The Handbook contains several different types of articles. Key AI ideas and techniques
are described in core articles (e.g., basic concepts in heuristic search, semantic nets).
Important individual AI programs (e.g., SHRDLU) are described in separate articles that
indicate, among other things, the designer's goal, the techniques employed, and the reasons
why the program is important. Overview articles discuss the problems and approaches in
each major area. The overview articles should be particularly useful to those who seek a
summary of the underlying issues that motivate AI research.


Eventually the Handbook will contain approximately two hundred articles. We hope that
the appearance of this material will stimulate interaction and cooperation with other AI
research sites. We look forward to being advised of errors of omission and commission. For a
field as fast moving as AI, it is important that its practitioners alert us to important
developments, so that future editions will reflect this new material. We intend that the
Handbook of Artificial Intelligence be a living and changing reference work.

The articles in this edition of the Handbook were written primarily by graduate students
in AI at Stanford University, with assistance from graduate students and AI professionals at
other institutions. We wish particularly to acknowledge the help from those at Rutgers
University, SRI International, Xerox Palo Alto Research Center, MIT, and the RAND
Corporation.

The authors of this report, which contains the section of the Handbook on educational
applications research, are William Clancey, James Bennett, and Paul Cohen. Others who
contributed to or commented on earlier versions of this section include Lee Blaine, John Seely
Brown, Richard Burton, Adele Goldberg, Ira Goldstein, Albert Stevens, and Keith Wescourt.


Avron Barr
Edward Feigenbaum


Stanford University
July, 1979
A. Historical Overview


Educational applications of computer technology have been under development since
the early 1960s. These applications have included scheduling courses, managing teaching
aids, and grading tests. The predominant application, however, has involved using the
computer as a device that interacts directly with the student, rather than as an assistant to
the human teacher. For this kind of application, there have been three general approaches.

The "ad lib" or "environmental" approach is typified by Papert's LOGO laboratory
(Papert, 1970), which allowed students more or less free-style use of the machine. Students
are involved in programming; it is conjectured that learning problem-solving methods takes
place as a side effect of using tools that are designed to suggest good problem-solving
strategies to the student. The second approach uses games and simulations as instructional
tools; once again the student is involved in an activity--for example, doing simulated
genetics experiments--for which learning is an expected side effect. The third computer
application in education is computer-assisted instruction (CAI). Unlike the first two
approaches, CAI makes an explicit attempt to instigate and control learning (Howe, 1973).
This third use of computer technology in education is the focus of the following discussion.

The goal of CAI research is to construct instructional programs that incorporate
well-prepared course material in lessons that are optimized for each student. Early programs
were either electronic "page-turners," which printed prepared text, or drill-and-practice
monitors, which printed problems and responded to the student's solutions using prestored
answers and remedial comments. In the intelligent CAI (ICAI) programs of the 1970s, course
material is represented independently of teaching procedures so that problems and remedial
comments can be generated differently for each student. Research today focuses on the
design of programs that can offer instruction in a manner that is sensitive to the student's
strengths, weaknesses, and preferred style of learning. The role of AI in computer-based
instructional applications is seen as making possible a new kind of learning environment.

This overview surveys how AI techniques have been used in research attempting to
create intelligent computer-based tutors. In the next article, some design issues are
discussed and typical components of ICAI systems are described. Subsequent articles
describe some important applications of artificial intelligence techniques in instructional
programs.


Frame-oriented CAI Systems

The first instructional programs took many forms, but all adhered to essentially the
same pedagogical philosophy. The student was usually given some instructional text
(sometimes without using the computer) and asked a question that required a brief answer.
After the student responded, he was told whether his answer was right or wrong. The
student's response was sometimes used to determine his "path" through the curriculum--the
sequence of problems he was given (see Atkinson & Wilson, 1969). When the student made an
error, the program branched to remedial material.

The courseware author attempts to anticipate every wrong response, prespecifying
branches to appropriate remedial material based on his ideas about what might be the
underlying misconceptions that would cause each wrong response. Branching on the basis of
response was the first step toward individualization of instruction (Crowder, 1962). This style
of CAI has been dubbed ad-hoc, frame-oriented (AFO) CAI by Carbonell (1970b), to stress its
dependence on author-specified units of information. (The term "frame" as it is used in this
context predates the more recent usage in AI--see Article Representation.B7--and refers to
a block or page or unit of information or text.) Design of ad-hoc frames was originally based
on Skinnerian stimulus/response principles. The branching strategies of some AFO programs
have become quite involved, incorporating the best learning theory that mathematical
psychology has produced (Atkinson, 1972; Fletcher, 1976; Kimball, 1973). Many of these
systems have been used successfully and are available commercially.
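
The author-specified branching described above can be illustrated with a minimal sketch.
It is not taken from any actual AFO system: the Frame fields, the next_frame function, and
the three-frame multiplication fragment are all hypothetical names chosen for illustration.
The point it tries to show is that every path, including each remedial branch, must be
written out in advance by the courseware author.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Frame:
    """One author-written unit of text plus its prespecified branches (hypothetical layout)."""
    text: str
    question: str
    correct: str
    on_correct: Optional[str] = None                        # frame shown after a right answer
    on_wrong: Dict[str, str] = field(default_factory=dict)  # anticipated error -> remedial frame
    fallback: Optional[str] = None                          # remedial frame for unanticipated errors


def next_frame(frames: Dict[str, Frame], current_id: str, answer: str) -> Optional[str]:
    """Return the id of the frame to show next, using only the author's branches."""
    frame = frames[current_id]
    reply = answer.strip().lower()
    if reply == frame.correct:
        return frame.on_correct
    # A wrong answer the author anticipated gets its own remedial frame;
    # anything else falls through to the generic one.
    return frame.on_wrong.get(reply, frame.fallback)


# Hypothetical course fragment; "q2" would be the next question frame.
COURSE = {
    "q1": Frame("Multiplication is repeated addition.", "What is 3 x 4?", "12",
                on_correct="q2", on_wrong={"7": "r_added"}, fallback="r_general"),
    "r_added": Frame("You added the numbers; 3 x 4 means 4 + 4 + 4.", "What is 3 x 4?", "12",
                     on_correct="q2", fallback="r_general"),
    "r_general": Frame("Count three groups of four objects.", "What is 3 x 4?", "12",
                       on_correct="q2", fallback="r_general"),
}
```

In this sketch the answer "7" is treated as an anticipated misconception (adding instead of
multiplying), while any other wrong answer receives only the generic remedial frame, which is
the limitation of frame selection that the ICAI work discussed next tries to overcome.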


Intelligent CAI

In spite of the widespread application of AFO CAI to many problem areas, many
researchers believe that most AFO courses are not the best use of computer technology:

    In most CAI systems of the AFO type, the computer does little more
    than what a programmed textbook can do, and one may wonder why
    the machine is used at all.... When teaching sequences are extremely
    simple, perhaps trivial, one should consider doing away with the
    computer, and using other devices or techniques more related to the
    task. (Carbonell, 1970b, pp. 32, 193)

In this pioneering paper, Carbonell goes on to define a second type of CAI that is known
today as "knowledge-based" or "intelligent" CAI (ICAI). Knowledge-based systems and the
previous CAI systems both have representations of the subject matter they teach, but ICAI
systems also carry on a natural language dialogue with the student and use the student's
mistakes to diagnose his misunderstandings.

Early uses of AI techniques in CAI were called "generative CAI" (Wexler, 1970), since
they stressed the ability to generate problems using a large database representing the
subject they taught. (See Koffman & Blount, 1976, for a review of some early generative
CAI programs and an example of the possibilities and limitations of this style of courseware.)
However, the kind of courseware that Carbonell was describing in his paper was to be more
than just a problem generator--it was to be a computer tutor that had the inductive powers
of its human counterparts. ICAI programs offer what Brown (1977) calls a reactive learning
environment, in which the student is actively engaged with the instructional system and his
interests and misunderstandings drive the tutorial dialogue. This goal was expressed by
other researchers trying to write CAI programs that extended the medium beyond the limits
of frame selection:

    Often it is not sufficient to tell a student he is wrong and indicate the
    correct solution method. An intelligent CAI system should be able to
    make hypotheses based on a student's error history as to where the
    real source of his difficulty lies. (Koffman & Blount, 1976)


The Use of AI Techniques in ICAI

The realization of the computer-based tutor has involved increasingly complicated
computer programs and has prompted CAI researchers to use artificial intelligence
techniques. Artificial intelligence work in natural language understanding, representation of
knowledge, and methods of inference, as well as specific AI applications like algebraic
simplification, calculus, and theorem proving, have been applied by various researchers
toward making CAI programs that are more intelligent and more effective. Early research on
ICAI systems focused on representation of the subject matter. Benchmark efforts include
SCHOLAR, the geography tutor of Carbonell and Collins (see Article C1), EXCHECK, the logic
and set theory tutors by Suppes et al. (Article C7), and SOPHIE, the electronics
troubleshooting tutor of Brown and Burton (Article C3). The high level of domain expertise in
these programs permits them to be responsive in a wide range of problem-solving
interactions.

These ICAI programs are quite different from even the most complicated
frame-oriented, branching program.

    Traditional approaches to this problem using decision theory and
    stochastic models have reached a dead end due to their
    oversimplified representation of learning. ... It appears within reach
    of AI methodology to develop CAI systems that act more like human
    teachers. (Laubsch, 1975)

However, an AI system that is expert in a particular domain is not necessarily an expert
teacher of the material--"ICAI systems cannot be AI systems warmed over" (Brown, 1977). A
teacher needs to understand what the student is doing, not just what he is supposed to do.
AI programs often use very powerful problem-solving methods that do not resemble those
used by humans; in many cases, CAI researchers borrowed AI techniques for representing
subject domain expertise but had to modify them, often making the inference routines less
powerful, in order to force them to follow human reasoning patterns, so as to better explain
their methods to the student, as well as to understand his methods (Smith, 1976; Goldberg,
1973).

In the mid-1970s, a second phase in the development of ICAI tutors has been
characterized by the inclusion of expertise in the tutor regarding (a) the student's learning
behavior and (b) tutoring strategies (Brown & Goldstein, 1977). AI techniques are used to
construct models of the learner that represent his knowledge in terms of "issues" (see
Article C4) or "skills" (Barr & Atkinson, 1975) that should be learned. This model then
controls tutoring strategies for presenting the material. Finally, some ICAI programs are now
using AI techniques to explicitly represent these tutoring strategies, gaining the advantages of
flexibility and modularity of representation and control (Burton & Brown, 1979; Goldstein,
1977; Clancey, 1979a).


References 

The best general review of research in ICAI is Brown & Goldstein (1977). Several
papers on recent work are collected in a special issue of the International Journal of
Man-Machine Studies, Volume 11, 1979.
B. Issues in ICAI Systems Design

The main components of an ICAI system are (a) its problem-solving expertise, the
knowledge that the system tries to impart to the student, (b) the student model, indicating
what the student does and does not know, and (c) tutoring strategies, which specify how the
system presents material to the student. (See Self, 1974, for an excellent discussion of
the differences and interrelations of the types of knowledge needed in an intelligent
program.) Not all of the components are fully developed in every system. Because of the
size and complexity of intelligent CAI programs, most researchers focus their
efforts on the development of a single part of what would constitute a fully usable system.

Each component is described briefly below.


The Expertise Module--Representing Domain Knowledge

The "expert" component of an ICAI system is charged with the task of generating
problems and evaluating the correctness of student solutions. The CAI system's knowledge
of the subject matter was originally envisioned as a huge static database that incorporated
all the facts to be taught. This idea was implicit in the early drill-and-practice programs and
was made explicit in generative CAI (see Article A). Representation of subject-matter
expertise in this way, using semantic nets (Article Representation.Ba), has been useful for
generating and answering questions involving causal or relational reasoning (Carbonell &
Collins, 1973; Laubsch, 1975; and see Articles C1 and C2 on the SCHOLAR and WHY
systems).

Recent systems have used procedural representation of domain knowledge, for example,
how to take measurements and make deductions (see Article Representation.B9). This
knowledge is represented as procedural experts that correspond to subskills that a student
must learn in order to acquire the complete skill being taught (Brown, Burton, & Bell, 1975).
Production rules (see the Representation chapter) have been used to construct modular
representations of skills and problem-solving methods (Goldstein, 1977; Clancey, 1979a). In
addition, Brown & Burton (1975) have pointed out that multiple representations are sometimes
useful for answering student questions and for evaluating partial solutions to a problem (e.g.,
a semantic net of facts about an electronic circuit and procedures simulating the functional
behavior of the circuit). Stevens & Collins (1978) considered an evolving series of
"simulation" models that can be used to reason metaphorically about the behavior of causal
systems.

It should be noted that not all ICAI systems can actually solve the problems they pose
to a student. For example, BIP, the BASIC Instructional Program (Barr, Beard, & Atkinson,
1976), can't write or analyze computer programs; BIP uses sample input/output data
(supplied by the course authors) to test students' programs. Similarly, the procedural
experts in SOPHIE-1 could not debug an electronic circuit. In contrast, the production rule
representation of domain knowledge used in WUMPUS and GUIDON enables these programs to
solve problems independently, as well as to criticize student solutions (Goldstein, 1977, and
Clancey, 1979a). Being able to solve the problems, preferably in all possible ways,
correctly and incorrectly, is necessary if the ICAI program is to make fine-grained
suggestions about the completion of partial solutions.

An important idea in this connection is that of an articulate expert (Goldstein, 1977).
Whereas typical expert AI programs have data structures and processing algorithms that do
not necessarily mimic the reasoning steps used by humans and are, therefore, considered
"opaque" to the user, an articulate expert for an ICAI system must be designed to enable
the explanation of each problem-solving decision that it makes in terms that correspond (at
some level of abstraction) to those of a human problem solver. For example, the electronic
circuit simulator underlying SOPHIE-1 (see Article C3), which is used to check the consistency
of a student's hypotheses and to answer some of his questions, is an opaque expert on the
functioning of the circuit. It is a complete, accurate, and efficient model of the circuit, but its
mechanisms are never revealed to the student since they are certainly not the mechanisms
that he is expected to acquire. In WEST, on the other hand, while a (complete and efficient)
opaque expert is used to determine the range of possible moves that the student could have
made with a given roll of the dice, an articulate expert, which only models pieces of the
game-playing expertise, is used to determine possible causes for less-than-optimal student
moves.

ICAI systems are distinguished from earlier CAI approaches by the separation of
teaching strategies from the subject expertise to be taught. However, the separation of
subject-area knowledge from instructional planning requires a structure for organizing the
expertise that captures the difficulty of various problems and the interrelationships of
course material. Modeling a student's understanding of a subject is closely related
conceptually to figuring out a representation for the subject itself or for the language used
to discuss it.

Trees and lattices showing prerequisite interactions have been used to organize the
introduction of new knowledge or topics (Koffman & Blount, 1975). In BIP this lattice took
the form of a curriculum net that related the skills to be taught to example programming tasks
that exercised each skill (Barr, Beard, & Atkinson, 1976). Goldstein (1979) called the
lattice a syllabus in the WUMPUS program and emphasized the developmental path that a
learner takes in acquiring new skills. For arithmetic skills used in WEST, Burton & Brown
(1976) use levels of issues. Issues proceed from the use of arithmetic operators to
strategies for winning the game, to meta-level considerations for improving performance.
Burton and Brown believe that when the skills are "structurally independent," the order of
their presentation is not particularly crucial. This representation is useful for modeling the
student's knowledge and coaching him on different levels of abstraction. Stevens, Collins, &
Goldin (1978) have argued further that a good human tutor does not merely traverse a
predetermined network of knowledge in selecting material to present. Rather, it is the
process of ferreting out student misconceptions that drives the dialogue.
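
As a rough illustration of how a prerequisite lattice can relate skills to tasks, the
following sketch picks the next programming task for a student. It is only loosely inspired
by the curriculum net described above and is not BIP's actual algorithm: the skill names,
the task names, and the selection rule (prefer the task that introduces the fewest new
skills, all of whose prerequisites are already mastered) are assumptions made for the example.

```python
# Hypothetical prerequisite lattice: skill -> skills that must be mastered first.
PREREQS = {
    "print": set(),
    "assignment": {"print"},
    "for_loop": {"assignment"},
    "arrays": {"for_loop"},
}

# Hypothetical curriculum net: task -> skills that the task exercises.
TASKS = {
    "hello": {"print"},
    "sum_two": {"print", "assignment"},
    "count_to_ten": {"assignment", "for_loop"},
    "reverse_list": {"for_loop", "arrays"},
}


def ready_skills(mastered):
    """Skills not yet mastered whose prerequisites have all been mastered."""
    return {s for s, pre in PREREQS.items() if s not in mastered and pre <= mastered}


def choose_task(mastered):
    """Pick a task whose new skills are all ready, preferring the fewest new skills."""
    ready = ready_skills(mastered)
    candidates = []
    for name, skills in TASKS.items():
        new = skills - mastered
        if new and new <= ready:
            candidates.append((len(new), name))
    return min(candidates)[1] if candidates else None
```

With nothing mastered, the sketch selects "hello"; once "print" is mastered it moves on to
"sum_two". A fixed traversal of this kind is exactly what Stevens, Collins, & Goldin argue a
good human tutor goes beyond.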


The Student Model

The modeling module is used to represent the student's understanding of the material
to be taught. Much recent ICAI research has focused on this component. The purpose of
modeling the student is to make hypotheses about his misconceptions and suboptimal
performance strategies so that the tutoring module can point them out, indicate why they are
wrong, and suggest corrections. It is advantageous for the system to be able to recognize
alternate ways of solving problems, including the incorrect methods that the student might
use resulting from systematic misconceptions about the problem or from inefficient
strategies.


AI Applications in Education


Some early frame-oriented CAI systems used mathematical stochastic learning models, but
this approach failed because it only modeled the probability that a student would give a
specific response to a stimulus. In general, knowing the probability of a response is not the
same as knowing what a student understands--the former has little diagnostic power
(Laubsch, 1976).

Typical uses of AI techniques for modeling student knowledge include (a) simple pattern
recognition applied to the student's response history and (b) flags in the subject matter
semantic net or in the rule base representing areas that the student has mastered. In these
ICAI systems, a student model is formed by comparing the student's behavior to that of the
computer-based "expert" in the same environment. The modeling component marks each skill
according to whether evidence indicates that the student knows the material or not. Carr &
Goldstein (1977) have termed this component an overlay model--the student's understanding
is represented completely in terms of the expertise component of the program (see Article
C5).
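
The overlay idea can be sketched in a few lines. This is a minimal illustration under
stated assumptions, not the mechanism of any particular system: the OverlayModel class, the
evidence counters, and the two-observation threshold are hypothetical. What it tries to show
is that the student is described only by marks placed on the expert's own skills, updated by
comparing what the student did with what the expert would have done in the same situation.

```python
class OverlayModel:
    """Student knowledge recorded purely as marks on the expert's own skills."""

    def __init__(self, expert_skills):
        # skill -> [times the student used it when the expert did, times the student missed it]
        self.evidence = {skill: [0, 0] for skill in expert_skills}

    def compare(self, expert_used, student_used):
        """Update the marks after one move by comparing the student with the expert."""
        for skill in expert_used:
            column = 0 if skill in student_used else 1
            self.evidence[skill][column] += 1

    def status(self, skill, min_observations=2):
        """Classify a skill; the threshold is an arbitrary assumption of this sketch."""
        used, missed = self.evidence[skill]
        if used + missed < min_observations:
            return "undetermined"
        return "known" if used >= missed else "unknown"


# Hypothetical arithmetic skills, observed over two moves.
model = OverlayModel({"addition", "multiplication", "parentheses"})
for _ in range(2):
    model.compare(expert_used={"multiplication", "parentheses"},
                  student_used={"multiplication"})
```

Because every mark refers to an expert skill, such a model can only say that the student
knows less than the expert; it has no way to express a method the expert would never use.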


In contrast, another approach is to model the student's knowledge not as a subset of
the expert's, but rather as a perturbation or deviation from the expert's knowledge--a
"bug". (See, for example, the SOPHIE and BUGGY systems--Articles C3 and C6.) There is a
major difference between the overlay and "buggy" approaches to modeling: In the latter
approach it is not assumed that, except for "knowing" less, the student reasons as the
expert does; the student's reasoning can be substantially different from expert reasoning.
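
A small sketch may make the contrast with the overlay model concrete. Here the student is
modeled as running a perturbed version of the expert's procedure, and diagnosis means finding
which procedure reproduces the student's answers. The example uses column subtraction; the
two bug procedures, their names, and the diagnose function are illustrative assumptions and
are not taken from the source. Inputs are assumed to be non-negative integers with the top
number at least as large as the bottom one.

```python
def column_subtract(top, bottom, digit_rule):
    """Apply digit_rule to each column independently, right to left, with no borrowing."""
    result, place = 0, 1
    while top > 0 or bottom > 0:
        result += digit_rule(top % 10, bottom % 10) * place
        top, bottom, place = top // 10, bottom // 10, place * 10
    return result


# The expert procedure plus two hypothetical perturbations of it.
PROCEDURES = {
    "correct": lambda a, b: a - b,
    # Bug: in every column, subtract the smaller digit from the larger one.
    "smaller-from-larger": lambda a, b: column_subtract(a, b, lambda t, u: abs(t - u)),
    # Bug: write 0 in any column where the top digit is too small.
    "zero-instead-of-borrow": lambda a, b: column_subtract(a, b, lambda t, u: max(t - u, 0)),
}


def diagnose(observations):
    """Return the names of all procedures consistent with every observed answer."""
    return [name for name, procedure in PROCEDURES.items()
            if all(procedure(a, b) == answer for (a, b), answer in observations)]
```

A student who answers 45 for 52 - 17 and 42 for 63 - 21 is consistent only with the
"smaller-from-larger" procedure, a hypothesis that cannot be stated as a subset of the
expert's skills. A problem that requires no borrowing, such as 63 - 21 alone, does not
discriminate between the candidates.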

Other information that might be accumulated in the student model includes the
student's preferred modes for interacting with the program, a rough characterization of his
level of ability, a consideration of what he seems to forget over time, and an indication of
what his goals and plans seem to be for learning the subject matter.

Major sources of evidence used to maintain the student model can be characterized
as: (a) implicit, from student problem-solving behavior; (b) explicit, from direct questions
asked of the student; (c) historical, from assumptions based on the student's experience;
and (d) structural, from assumptions based on some measure of the difficulty of the subject
material (Goldstein, 1977). Historical evidence is usually determined by asking the student
to rate his level of expertise on a scale from "beginner" to "expert." Early programs like
SCHOLAR used only explicit evidence. Recent programs have concentrated on inferring
"implicit" evidence from the student's problem-solving behavior. This approach is
complicated because it is limited by the program's ability to recognize and describe the
strategies being used by the student. Specifically, when the expert program indicates that
an inference chain is required for a correct result and the student's observable behavior is
wrong, how is the modeling program to know which of the intermediate steps are unknown or
wrongly applied by the student? This is the apportionment of credit/blame problem; it has been
an important focus of WEST research.
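
The credit/blame problem can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: the `StudentModel` class, the skill names, and the policy of splitting credit or blame evenly over an inference chain are all hypothetical and are not taken from WEST or any other program discussed here.

```python
from collections import defaultdict


class StudentModel:
    """Toy student model: one running evidence score per skill (hypothetical)."""

    def __init__(self):
        self.evidence = defaultdict(float)

    def observe(self, chain, correct):
        """Update every skill in an inference chain after one observed answer.

        Assumption: only the final answer is observable, so the credit (or
        blame) is split evenly over all steps in the chain.
        """
        if not chain:
            return
        share = (1.0 if correct else -1.0) / len(chain)
        for skill in chain:
            self.evidence[skill] += share

    def suspects(self):
        """Skills whose accumulated evidence is negative, most suspect first."""
        bad = [(score, skill) for skill, score in self.evidence.items() if score < 0]
        return [skill for score, skill in sorted(bad)]


model = StudentModel()
model.observe(["add", "carry"], correct=False)  # wrong answer: blame both steps
model.observe(["add"], correct=True)            # a later success clears "add"
```

After these two observations only "carry" remains suspect. Problems that share some steps but not others are what allow a program of this kind to separate the candidate explanations; a single wrong answer cannot.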

Because of inherent limitations in the modeling process, it is useful for a "critic" in the
modeling component to measure how closely the student model actually predicts the
student's behavior. Extreme inconsistency or an unexpected demonstration of expertise in
solving problems might indicate that the representation being used by the program does not
capture the student's approach. Finally, Goldstein (1977) has suggested that the modeling
process should attempt both to measure whether or not the student is actually learning and
to discern what teaching methods are most effective. Much work remains to be done in this
area.


The Tutoring Module

The tutoring module of ICAI systems must integrate knowledge about natural language
dialogues, teaching methods, and the subject area to be taught. This is the module that
communicates with the student: selecting problems for him to solve, monitoring and criticizing
his performance, providing assistance upon request, and selecting remedial material. The
design of this module involves issues like "When is it appropriate to offer a hint?" or "How
far should the student be allowed to go down the wrong track?"

These  are  just  some  of  the  problems  which  stem  from  the  basic  fact 
that  teaching  is  a  skill  which  requires  knowledge  additional  to  the 
knowledge  comprising  mastery  of  the  subject  domain.  (Brown,  1977) 

This  additional  knowledge,  beyond  the  representation  of  the  subject  domain  and  of  the 
student's  knowledge,  is  about  how  to  teach. 

Most ICAI research has explored teaching methods based on diagnostic modeling in
which the program debugs the student's understanding by posing tasks and evaluating his
response (Collins, 1976; Brown & Burton, 1975; Koffman & Blount, 1975). The student is
expected to learn from the program's feedback which skills he uses wrongly, which skills he
does not use (but could use to good advantage), etc. Recently, there has been more
concern with the possibility of saying just the right thing to the student so that he will
realize his own inadequacy and switch to a better method (Carr & Goldstein, 1977; Burton &
Brown, 1979; Norman, Gentner, & Stevens, 1976). This new direction is based on attempts
to make a bug "constructive" by establishing for the student that there is something
inadequate in his approach, and giving enough information so that the student can use what
he already knows to focus on the bug and characterize it so that he avoids this failing in the
future.

However, it is by no means clear how "just the right thing" is to be said to the student.
We do know that it depends on having a very good model of his understanding process (the
methods and strategies he used to construct a solution). Current research is focusing on
means for representing and isolating the bugs themselves (Stevens, Collins, & Goldin, 1978;
Brown & Burton, 1978).

Another approach is to provide an environment that encourages the student to think in
terms of debugging his own knowledge. In one BIP experiment (Wescourt & Hemphill,
1978), explicit debugging strategies (for computer programming) were conveyed in a written
document and then a controlled experiment was undertaken to see whether this training
fostered a more rational approach for detecting faulty use of (programming) skills.

Brown, Collins, and Harris (1978) suggest that one might foster the ability to construct
hypotheses and test them (the basis of understanding in their model) by setting up problems
in which the student's first guess is likely to be wrong, thus requiring him to focus on how
he detects that his guess is wrong and how he then intelligently goes about revising it.

The Socratic method used in WHY (Stevens & Collins, 1977) involves questioning the
student in a way that will encourage him to reason about what he knows and thereby modify
his conceptions. The tutor's strategies are constructed by analyzing protocols of real-world
student/teacher interactions.

Another teaching strategy that has been successfully implemented on several systems
is called coaching (Goldstein, 1977). Coaching programs are not concerned with covering a
predetermined lesson plan within a fixed time (in contrast with SCHOLAR). Rather, the goal
of coaching is to foster the acquisition of skill and general problem-solving abilities, and it
works by engaging the student in a computer game (see Article A). In a coaching situation,
the immediate aim of the student is to have fun, and skill acquisition is an indirect
consequence. Tutoring comes about when the computer coach, which is "observing" the
student's play of the game, interrupts him and offers new information or suggests new
strategies. A successful computer coach must be able to discern what skills or knowledge
the student might acquire, based on his playing style, and to judge effective ways to
intercede in the game and offer advice. WEST and WUMPUS (Articles C4 and C5) are both
coaching programs.

Socratic tutoring and coaching represent different styles for communicating with the
student. All mixed-initiative tutoring follows some dialogue strategy, which
involves decisions about when and how often to question the student and methods for
presentation of new material and review. For example, a coaching program, by design, is
non-intrusive and only rarely lectures. On the other hand, a Socratic tutor questions
repetitively, requiring the student to pursue certain lines of reasoning. Recently ICAI
research has turned to making explicit these alternative dialogue management principles.
Collins (1976) has pioneered the careful investigation and articulation of teaching
strategies. Recent work has explored the representation of these strategies as production
rules (see Clancey, 1979a, and Article C2 on Collins and Stevens' WHY system).

For example, the tutoring module in the GUIDON program, which discusses MYCIN-like
"case diagnosis" tasks with a student (see Clancey, 1979a, and Article C1 on MYCIN), has an
explicit representation of discourse knowledge. Tutoring rules select alternative dialogue
formats on the basis of economy, domain logic, and tutoring or student modeling goals.
Arranged into procedures, these rules cope with various recurrent situations in the tutorial
dialogue, for example: introducing a new topic, examining a student's understanding after he
asks a question that indicates unexpected expertise, relating an inference to one just
discussed, giving advice to the student after he makes a hypothesis about a subproblem, and
wrapping up the discussion of a topic.


Conclusion 

In general, ICAI programs have only begun to deal with the problems of representing
and acquiring teaching expertise and of determining how this knowledge should be integrated
with general principles of discourse. The programs described in the articles to follow have all
investigated some aspect of this problem, but none offers an "answer" to the question of how
to build a computer-tutor. Nevertheless, these programs have demonstrated potential
tutorial skill, sometimes showing striking insight into students' misconceptions.
Research continues toward making viable AI contributions to computer-based education.


C. ICAI Systems
C1. SCHOLAR

An  important  aspect  of  tutoring  is  the  ability  to  generate  appropriate  questions  for  the 
student.  These  questions  can  be  used  by  the  tutor  to  indicate  the  relevant  material  to  be 
learned,  to  determine  the  extent  of  a  student's  knowledge  of  the  problem  domain,  and  to 
identify  any  misconceptions  that  he  might  have.  Given  that  the  knowledge  base  of  a  tutoring 
program  can't  contain  all  of  the  "facts"  that  are  true  about  the  domain,  the  tutor  should  be 
able  to  reason  about  what  it  knows  and  make  plausible  inferences  about  facts  in  the  domain.  In 
addition  to  responding  to  the  student's  questions,  the  tutor  should  be  able  to  take  the 
initiative  during  a  tutoring  dialogue  by  generating  good  tutorial  questions. 


SCHOLAR is one such mixed-initiative computer-based tutorial system; both the system
and the student can initiate conversation by asking questions. SCHOLAR was the pioneering
effort in the development of computer tutors capable of coping with unanticipated student
questions and of generating subject matter in varying levels of detail, according to the
context of the dialogue. Both the student's input and the program's output are in English
sentences.

The original system, created by Jaime Carbonell, Allan Collins, and their colleagues at
Bolt, Beranek and Newman, Inc., tutored students about simple facts in South American
geography (Carbonell, 1970b). SCHOLAR uses a number of tutoring strategies for composing
relevant questions, determining whether or not the student's answers are correct, and
answering questions from the student. Both the knowledge representation scheme (see
below) and the tutorial capabilities are applicable to other domains besides geography. For
example, NLS-SCHOLAR was developed to tutor computer-naive people in the use of a
complex text-editing program (Grignetti, Hausman, & Gould, 1976).

In addition to investigating the nature of tutorial dialogues and human plausible
reasoning, the SCHOLAR research project explored a number of AI issues, including:

1. How can real-world knowledge be stored effectively for the fast, easy
retrieval of relevant facts needed in tutoring?

2. What general reasoning strategies are needed to make appropriate inferences
from the typically incomplete database of the tutor program?

3. To what extent can these strategies be independent of the domain
being discussed (i.e., be dependent only on the form of the representation)?


The Knowledge Base--Semantic Nets

In SCHOLAR, knowledge about the domain being tutored is represented in a semantic net
(see Article Representation.B2). Each node or "unit" in the net, corresponding to some
geographical object or concept, is composed of the name associated with that node and a
set of properties. These properties are lists of attribute-value pairs. For example, Figure 1
shows a representation of the unit for Peru:




PERU:
  ((EXAMPLE-NOUN PERU))
  (I 0)                                        "importance" of unit is high
  (SUPERC (I 0) COUNTRY)                       link to superordinate units
  (SUPERP (I 6) SOUTH/AMERICA)
  (LOCATION (I 0)                              values of LOCATION attribute follow:
    (IN (I 0) (SOUTH/AMERICA (I 0) WESTERN))
    (ON (I 0) (COAST (I 0) (OF (I 0) PACIFIC)))
    (LATITUDE (I 4) (RANGE (I 0) -18 0))
    (LONGITUDE (I 5) (RANGE (I 0) -82 -68))
    (BORDERING/COUNTRIES (M)
      (NORTHERN (I 1) (LIST COLUMBIA ECUADOR))
      (EASTERN (I 1) BRAZIL)


Figure 1. The unit for PERU.

Attributes can be English words (other units) that are defined elsewhere in the net or one of
several special attributes such as "SUPRA" for superattribute, "SUPERC" for superconcept
or superordinate, "SUPERP" for superpart, or case structure attributes (see
below). An example of SUPRA might be the fact that "fertile" refers to "soil" and "soil"
refers to "topography"; of SUPERP, that Peru is part of South America; of SUPERC, that Peru
is a country. Values can also be importance tags, like the expressions "(I 0)" after LOCATION
in Figure 1 and "(I 1)" after EASTERN: the lower the number, the more important the property.
SCHOLAR uses these tags to measure the relevance of a node with respect to the topic
under discussion (see below).
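
To make the role of the importance tags more tangible, here is a minimal Python sketch. It is not SCHOLAR's implementation: the nested-dictionary encoding, the function name `relevant`, and the rule that a property is pruned whenever its own tag or any enclosing tag exceeds a threshold are all assumptions made for illustration. The tag shown for BORDERING/COUNTRIES is likewise assumed.

```python
# Hypothetical encoding of part of the PERU unit from Figure 1.
# Each entry maps an attribute to (importance_tag, value); lower tag = more important.
PERU = {
    "SUPERC": (0, "COUNTRY"),
    "SUPERP": (6, "SOUTH/AMERICA"),
    "LOCATION": (0, {
        "IN": (0, "SOUTH/AMERICA"),
        "LATITUDE": (4, (-18, 0)),
        "LONGITUDE": (5, (-82, -68)),
        "BORDERING/COUNTRIES": (1, {   # tag value assumed for this sketch
            "NORTHERN": (1, ["COLUMBIA", "ECUADOR"]),
            "EASTERN": (1, "BRAZIL"),
        }),
    }),
}


def relevant(unit, max_tag, path=()):
    """Yield (path, value) for properties whose tag, and every enclosing tag,
    is no larger than max_tag."""
    for attribute, (tag, value) in unit.items():
        if tag > max_tag:
            continue  # too unimportant for the current threshold
        here = path + (attribute,)
        if isinstance(value, dict):
            yield from relevant(value, max_tag, here)
        else:
            yield here, value


most_important = list(relevant(PERU, 0))
somewhat_important = list(relevant(PERU, 1))
```

With a threshold of 0 only the superconcept and the continent survive; raising the threshold to 1 also brings in the bordering countries. A tutor could tighten or relax such a threshold depending on how much time remains for a topic.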


Tutorial  Dialogue 

An important part of the SCHOLAR research has been an emphasis on the nature of
tutorial dialogue, in particular the Socratic style of tutoring, where the tutor attempts to
diagnose the student's misconceptions and then present material that will force the student
to see his own errors (Collins, Warnock, & Passafiume, 1974). Many human tutorial protocols
have been analyzed to see what strategies human tutors follow, in order to try to
incorporate these strategies in the system (see also Article C2 on the group's continuing
research in this direction). The following example is an excerpt from a session with the
South American geography tutor. The student's questions and answers are preceded by **
and annotations are shown in italics:

What is the language of Argentina?

**  PROBABLY  SPANISH 

Very  good. 

Approx  what  is  the  area  of  Argentina? 

Student  changes  topic... 

**  TELL  ME  SOMETHING  ABOUT  PERU 

Peru is a country.

It is located in South America.

The capital of Peru is Lima.

Now answer the question you didn't answer before.

...and finally answers the question.

** THE AREA OF ARGENTINA IS 1,100,000 SQUARE MILES

Approximately true, but a more exact value is 1,200,000.

The eastern bordering country of Peru is

Student makes an error and is corrected...

** CATHOLICISM

Wrong. Catholicism is not a country but a religion.

The correct answer is: Brazil

Where is Peru?

...and once again changes topic.

** IS ASUNCION HOT?

Since Asuncion is part of Paraguay, yes.

**  WHAT  IS  THE  MOST  IMPORTANT  DIFFERENCE  BETWEEN  PERU  AND  PARAGUAY? 

The  primary  difference  is:  The  population  of  Peru  is  larger  by  a 
factor  of  7.8. 

The natural language processing used by SCHOLAR is based on a case grammar system
(Article Natural Language.B3b), and the semantic interpretation of a student's response is
heavily guided by the geography knowledge in the semantic net. Natural language
processing is important in SCHOLAR because it allows the student to take the initiative and
ask questions that were not anticipated by the program. However, comprehension of English
input is facilitated by a taxonomy of expected "question types." The case grammar is also
used by the system to construct questions and presentations of new material from the
semantic network. English output is composed of short, simple sentences, with no embedded
clauses and a limited repertoire of verbs--generally some form of the verb "to be."

A simple agenda is used to keep track of topics that are being discussed. Timing
considerations and relevance (importance tags) affect the generation and pruning of topics
on this agenda. Continuity between questions is weak, however, since SCHOLAR does not
plan a series of questions to make a point. SCHOLAR is capable of diagnosing a student's
confusion only by following up one question with a related question.


Making  Inferences 

SCHOLAR's inference strategies, for answering student questions and evaluating
student answers to its questions, are designed to cope with the incompleteness of the
information stored in the semantic net database. Some of the important strategies used to
reason with incomplete knowledge are given below. These abilities have been explored
further in current research dealing with default reasoning (Reiter, 1978) and plausible
reasoning (Collins, 1978).

Intersection search. Answering questions of the form "Can X be a Y?" (e.g., "Is Buenos
Aires a city in Brazil?") is done by an intersection search: The superconcept (SUPERC)
arcs of both nodes for X and Y are traced until an intersection is found (i.e., a common
superconcept node is found). If there is no intersection, the answer is NO. If there is an
intersection node Q, SCHOLAR answers as follows:

If Q=Y, then "YES";

If Q=X, then "NO, Y IS AN X."

For example, the question "Is Buenos Aires in Brazil?" is answered YES because Brazil is a
SUPERC of Buenos Aires in the net (Q=Y):

SOUTH AMERICA
     |
     |  (Superconcept)
     |
BRAZIL (Y)
     |
     |  (Superconcept)
     |
BUENOS AIRES (X)

But the question "Is Brazil in Buenos Aires?" gets the response "NO, BRAZIL is a country."

SOUTH AMERICA
     |
     |  (Superconcept)
     |
BRAZIL (X)
     |
     |  (Superconcept)
     |
BUENOS AIRES (Y)


Common superordinate. Otherwise, if Q is not X or Y, the program focuses on the two
elements that have Q as a common superordinate. If they are contradictory (contain suitable
CONTRA properties) or have distinguishing, mutually exclusive properties (e.g., different
LOCATIONS), the answer is "NO"; otherwise the system answers "I DON'T KNOW." Answering
"Is X a part of Y?" questions is similar, except SUPERP (superpart) arcs are used for the
intersection process.
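
The intersection search and the common-superordinate test can be sketched in a few lines of Python. This is a simplified illustration, not SCHOLAR's code: the `LINKS` table, the function names, and the use of a single `location` table to stand in for "mutually exclusive properties" are assumptions. The toy links reproduce the example net drawn in the text, plus one extra hypothetical node.

```python
# Hypothetical superconcept links (node -> its superconcept), following the text's example net.
LINKS = {
    "BUENOS AIRES": "BRAZIL",
    "BRAZIL": "SOUTH AMERICA",
    "PERU": "SOUTH AMERICA",
}


def superconcept_chain(node, links):
    """Return the node followed by its successive superconcepts."""
    path = [node]
    while path[-1] in links:
        path.append(links[path[-1]])
    return path


def can_x_be_y(x, y, links, location=None):
    """Answer "Can X be a Y?" by intersecting the two superconcept chains."""
    location = location or {}
    x_chain = superconcept_chain(x, links)
    y_chain = superconcept_chain(y, links)
    # Q is the first node on X's chain that also lies on Y's chain.
    q = next((node for node in x_chain if node in y_chain), None)
    if q is None:
        return "NO"
    if q == y:
        return "YES"
    if q == x:
        return f"NO, {y} IS AN {x}"  # response template as given in the text
    # Common superordinate: compare the two elements directly below Q.
    a = x_chain[x_chain.index(q) - 1]
    b = y_chain[y_chain.index(q) - 1]
    if a in location and b in location and location[a] != location[b]:
        return "NO"  # mutually exclusive properties (assumed: differing locations)
    return "I DON'T KNOW"
```

Called as `can_x_be_y("BUENOS AIRES", "BRAZIL", LINKS)`, the sketch follows the Q=Y branch; swapping the arguments follows the Q=X branch. The same function would serve for part-of questions if it were handed a table of SUPERP links instead.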

Open and closed sets. In order to look for all objects in the system that satisfy some
condition (e.g., "How many cities in Colombia are on the Atlantic?"), a distinction must be
made about whether the resulting set of these objects is closed (explicitly contains all such
objects) or open (contains some and need not contain all such objects). In SCHOLAR's net,
sets are tagged by the course author as either open or closed, thus giving an indication of
the incompleteness of the system's knowledge.

While it is easy for SCHOLAR to answer questions (such as, "Is France a country in
South America?") about closed sets, similar questions about open sets require the use of
special techniques--SCHOLAR must be able to reason about the extent of its knowledge. For
example, if we ask SCHOLAR whether or not rubber is a product of, say, Guyana and if there
is no explicit information about Guyanese rubber production in the semantic net, SCHOLAR
starts to reason with what it knows: It compares the importance tags of the agricultural
products of Guyana with the importance tags for rubber in countries that it knows produce
rubber. It happens that Peru is a known rubber producer. SCHOLAR looks at the I-tag of
rubber for Peru and compares this tag to the largest (least important) I-tags associated with
Guyana. If it finds that these are comparable, SCHOLAR infers that it knows as much about
Guyana (up to some large I-tag value) as it knows about Peru. Therefore SCHOLAR infers
that it would know about rubber production in Guyana if it were important. Since rubber is
not listed for Guyana, SCHOLAR makes the plausible (though uncertain) conclusion that rubber
is not produced there. At this point SCHOLAR answers that it doesn't know for certain and
gives the information about rubber and Peru.
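
The rubber example can be paraphrased as a short Python sketch. Everything specific here is assumed for illustration: the `PRODUCTS` table and its I-tag values are invented, and "comparable" is interpreted as "the item's I-tag elsewhere is no larger than the largest I-tag recorded for the country in question."

```python
# Hypothetical product lists with I-tags (lower = more important); values are invented.
PRODUCTS = {
    "PERU": {"COPPER": 1, "COTTON": 2, "RUBBER": 4},
    "GUYANA": {"SUGAR": 1, "BAUXITE": 2, "RICE": 4},
}


def is_product(item, country, products=PRODUCTS):
    """Answer "Is ITEM a product of COUNTRY?" over an open set of products."""
    known = products[country]
    if item in known:
        return "YES"
    # How deeply the net describes this country: its largest (least important) I-tag.
    depth = max(known.values(), default=-1)
    for other, items in products.items():
        if other != country and item in items and items[item] <= depth:
            # The net goes at least as deep for this country as it does for the
            # item elsewhere, so the item's absence is (uncertain) negative evidence.
            return f"PROBABLY NOT; {item} IS A PRODUCT OF {other}"
    return "I DON'T KNOW"
```

In this toy table `is_product("RUBBER", "GUYANA")` yields the hedged negative answer together with the known producer, mirroring the behavior described above. An item that appears nowhere in the table simply produces "I DON'T KNOW".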

SCHOLAR's use of knowledge about the extent of its knowledge in this kind of plausible
reasoning is unique in AI research and represents an application of meta-level knowledge (see
Representation.Overview).


Summary 

The inferencing strategies used by SCHOLAR are independent of the content of the
semantic net, and are applicable in different domains. The inferences produced are fairly
natural; that is, they cope with the incomplete knowledge by employing reasoning processes
similar to those that people use. The SCHOLAR project as a whole provides an ongoing
environment for research on discourse, teaching strategies, and human plausible reasoning
(see Article C2 on recent research, including the WHY system).


References 

Carbonell (1970a) is a classic paper, defining the field of ICAI and introducing the
SCHOLAR system. Collins (1976) is an illuminating study of human tutorial dialogues. Collins
et al. (1976) discusses inference mechanisms, and Collins (1978) reports extended
research on human plausible reasoning. Grignetti, Hausman, & Gould (1976) describes
NLS-SCHOLAR.


C2.  WHY 

Recent research by Allan Collins, Albert Stevens, and their colleagues at Bolt,
Beranek and Newman, Inc., has focused on computer-based tutors that can
discuss complex systems. Their previous work on SCHOLAR (Article C1), a system that
tutors facts about South American geography, led them to investigate the nature of tutorial
dialogues about subject matter that was not just factual--where the causal and temporal
interrelations between the concepts were of interest and where students'
errors could involve not only forgotten facts but also misconceptions about why processes
work the way they do. Stevens & Collins (1977) are building a new system, called WHY, that
tutors students in the causes of rainfall, a complex geophysical process that is a function of
many interrelated factors. No single factor can be isolated that is both necessary and
sufficient to account for rainfall.

[...] 1978):

1. [...] below.)

2. What types of misconceptions do students have? How do tutors diagnose
these misconceptions from the errors students make?

3. What are the abstractions and viewpoints that tutors use to explain physical
processes?

By analyzing tutorial dialogues between human tutors and students, the researchers
identify elements of a theory of tutoring, which are then built into a tutorial program,
which is then used to find weaknesses in the theory that call for further investigation. The
current system is the first of a series of iterations of this sort. The work
so far has concentrated on the first topic above, the nature of Socratic tutoring.


Socratic  Tutoring  Heuristics 

Collins (1976) argues that learning is best accomplished by dealing with specific cases
and trying to generalize from them. Socratic dialogue is well suited for tutoring complex
subjects where factors interact and where no single factor accounts for the phenomenon
under consideration. In an effort to explicitly characterize the nature of the Socratic
dialogue, the current version of the WHY system incorporates 24 heuristics which control
the student/system interaction. An example heuristic is:

If the student gives as an explanation of causal dependence one or more
factors that are not necessary,

Pick a counterexample with the wrong value of the factor and ask the
student why his causal dependence doesn't hold in that case.

This rule forces the student to consider the necessity of a particular factor. For example, if
the student gives rainfall as a reason for growing rice, then the computer-generated
counterexample "Why do they grow rice in Egypt, where there isn't much rainfall?"
challenges the student's explanation of rice growing. These heuristic rules are designed to
facilitate tutorial dialogues in which students must consider combinations of factors that are
necessary for rainfall, eliminate irrelevant factors, and attempt to generalize from specific
cases to general conditions. (See Collins, 1976, for a complete discussion of the tutoring
rules.)
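
The example heuristic lends itself to a brief sketch. The Python below is a hypothetical rendering, not the WHY system's rule language: the `CASES` table, its factor names, and both function names are assumptions introduced only to show how a counterexample for an unnecessary factor might be selected.

```python
# Hypothetical case base: factor values and outcomes for a few places (illustrative only).
CASES = {
    "JAPAN": {"heavy rainfall": True, "warm climate": True, "rice growing": True},
    "EGYPT": {"heavy rainfall": False, "warm climate": True, "rice growing": True},
}


def counterexample_for_unnecessary(factor, outcome, cases=CASES):
    """Find a case where the outcome holds although the cited factor is absent."""
    for place, facts in cases.items():
        if facts.get(outcome) and facts.get(factor) is False:
            return place
    return None


def challenge(factor, outcome, cases=CASES):
    """Turn a counterexample into a question for the student, if one exists."""
    place = counterexample_for_unnecessary(factor, outcome, cases)
    if place is None:
        return None  # the case base offers no evidence that the factor is unnecessary
    return f"Why is there {outcome} in {place} without {factor}?"
```

Here `challenge("heavy rainfall", "rice growing")` produces a question about Egypt, while the same call with "warm climate" returns `None` because no stored case contradicts it. The companion heuristic for insufficient causes would search in the opposite direction: a case where the factor is present but the outcome is not.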

WHY's knowledge base about rainfall is represented in script-like data structures, which
encode temporal relations in the rainfall process; for example, "First water evaporates from
an ocean or sea, then the water-laden air mass is carried over land by winds, which in turn
causes the air mass to cool, which causes precipitation." (See Schank & Abelson, 1977, on
scripts, as well as Article Representation.B7.) This knowledge representation is adequate for
capturing many of the characteristics of a tutorial dialogue, but there are other kinds of
knowledge about rainfall that aren't represented here, discussed below.
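
As a rough illustration of what a script-like structure offers a tutor, the sketch below stores the quoted rainfall sequence as an ordered list. The list encoding and the two helper functions are assumptions for this example; the actual WHY scripts are richer than a flat list.

```python
# Steps paraphrased from the rainfall script quoted in the text, in temporal order.
RAINFALL_SCRIPT = [
    "water evaporates from an ocean or sea",
    "water-laden air mass is carried over land by winds",
    "air mass cools",
    "precipitation",
]


def intermediate_steps(cause, effect, script=RAINFALL_SCRIPT):
    """Return the steps strictly between two script steps, or None if out of order."""
    i, j = script.index(cause), script.index(effect)
    if i >= j:
        return None
    return script[i + 1:j]


def missing_steps(student_steps, script=RAINFALL_SCRIPT):
    """Steps of the script the student has not mentioned, in script order."""
    mentioned = set(student_steps)
    return [step for step in script if step not in mentioned]
```

A tutor that has heard only the first and last steps could use `missing_steps` to decide which intermediate causes to ask about. What a flat temporal list cannot express is why each step leads to the next.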

The following excerpt is from a dialogue with the current version of the WHY system.
The system prompts the student to suggest causes of rainfall, to look for prior or
intermediate causes, and finally to suggest a general rule. When such a rule is proposed, the
system finds a counterexample and forces revision of the rule to account for it. The system
proceeds in this way until a rule of sufficient generality is found. The protocol is annotated
with the heuristics employed by the system at each juncture in the conversation.

Heuristic  1:  Ask  for  a  prediction  about  a  particular  case. 

Do  you  think  the  Amazon  jungle  has  heavy  rainfall  or  light  rainfall? 

**  HEAVY  RAINFALL 

Heuristic  2:  Ask  for  prior  causes. 

Why does the Amazon jungle have heavy rainfall?

** BECAUSE ITS NEAR THE ATLANTIC

Yes, the Amazon jungle is near the equatorial Atlantic.

Heuristic 3: Ask for intermediate causes.

How does that affect rainfall in the Amazon jungle?

**  THE  WATER  IN  THE  CURRENT  IS  WARM 

(Section  Omitted) 


Heuristic  4:  Ask  for  subsequent  causes. 

What  happens  after  the  moisture-filled  air  reaches  the  land? 
**  I  DON'T  KNOW 

Heuristic 5: Suggest a factor and ask if it is a cause.

Do you think the Andes mountain range has any effect on the rainfall
in the Amazon jungle?

** YES

Heuristic 3: Ask for intermediate causes.

How?

** THE PREVAILING WIND HITS THE ANDES AND THE AIR COOLS

Heuristic 3: Ask for intermediate causes.

How does the fact that prevailing wind hits the Andes lead to
air cooling?

(Section Omitted)


Heuristic  6:  Form  a  general  rule  for  an  insufficient  cause 

Do  you  think  that  any  place  with  mountains  has  heavy  rainfall? 

**  YES 

Heuristic  7:  Pick  a  counterexample  for  an  insufficient  cause,  and  Heuristic  2:  Ask  for 
prior  causes 

Southern  California  has  mountains.  Why  doesn't  Southern  California 
have  heavy  rainfall? 


Current  Research 

Tutorial goals. One of the shortcomings of the existing system is that it doesn't have
long-term "goals" for the tutorial dialogue. Implicit in the tutorial rules is some idea about
local management of the interaction, but a global strategy about the tutoring session is
absent. Human tutors, however, admit to having goals like "Concentrate on one particular
part of the causal structure of rainfall at a time," or "Clear up one misconception before
discussing another." Stevens & Collins (1977) set about codifying these goals and
strategies for incorporation into the WHY system. They analyzed tutoring protocols in which
human tutors commented on what they thought the students did and didn't know, and on why
they responded to the students as they did. From this analysis, two top-level goals became
apparent:

1. Refine the student's causal structure, starting with the most important
factors in a particular process and gradually incorporating more subtle
factors.

2. Refine the student's procedures for applying his causal model to novel
situations.

Student misconceptions. The top-level goals involve subgoals of identifying and
correcting the student's misconceptions. Stevens & Collins (1977) classified these
subgoals into five categories corresponding to types of bugs and how to correct them:

Factual bugs. Dealt with by correcting the student. Teaching facts is not
the goal of Socratic tutoring; interrelationships of facts are more important.

Outside-domain bugs. Misconceptions about causal structure, which the
tutor chooses not to explain in detail. For example, the "correct" relationship
between the temperature of air and its moisture-holding capacity is often
stated by the tutor as a fact, without further explanation.

Overgeneralization. When a student makes a general rule from an
insufficient set of factors (e.g., any place with mountains has heavy rainfall),
the tutor will find counterexamples to probe for more factors.

Overdifferentiation.  When  a  student  counts  factors  as  necessary  when 
they  are  not,  the  tutor  will  generate  counterexamples  to  show  that  they  are 
not. 

Reasoning  bugs.  Tutors  will  attempt  to  teach  students  skills  such  as  forming 
and  testing  hypotheses  and  collecting  enough  information  before  drawing  a 
conclusion. 

If  a  student  displays  more  than  one  bug,  human  tutors  will  employ  a  set  of  heuristics  to 
decide  which  one  to  correct  first: 

1.  Correct  errors  before  omissions. 

2. Correct causally prior factors before later ones.

3. Make short corrections before longer ones.

4. Correct low-level bugs (in the causal network) before correcting
higher-level ones.
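
One simple way to combine the four ordering heuristics above is to treat them as successive tie-breakers. That reading is an assumption, since the text does not say how human tutors weigh the heuristics against one another, and the `Bug` fields below are hypothetical stand-ins for whatever a real student model would record.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    name: str
    is_error: bool          # True for an error, False for an omission
    causal_position: int    # smaller = causally earlier factor
    correction_length: int  # rough size of the correction, e.g., in sentences
    level: int              # smaller = lower level in the causal network


def correction_order(bugs):
    """Sort bugs by the four heuristics, applied in the order listed (assumed lexicographic)."""
    return sorted(
        bugs,
        key=lambda b: (not b.is_error, b.causal_position, b.correction_length, b.level),
    )


bugs = [
    Bug("A", is_error=False, causal_position=1, correction_length=1, level=1),
    Bug("B", is_error=True, causal_position=2, correction_length=2, level=1),
    Bug("C", is_error=True, causal_position=1, correction_length=1, level=1),
]
ordered = [b.name for b in correction_order(bugs)]
```

In this toy case the two errors come before the omission, and among the errors the causally earlier one is corrected first. A weighted score would be an equally defensible reading of the heuristics; the lexicographic version is merely the easiest to state.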

Functional relationships. The bugs just discussed are all domain independent, that is,
they would occur in tutorial dialogues about other complex processes besides rainfall. But
some bugs are the results of specific misconceptions about the functional interrelationships
of the concepts of the specific domain. For example, one common misconception about
rainfall is that "cooling causes air to rise" (Stevens, Collins, & Goldin, 1978). This is not a
simple factual misconception, nor is it domain independent. It is best characterized as an
error in the student's functional model of rainfall.

In fact, the script representation used in the WHY system for capturing the temporal
and causal relations of land, air, and water masses in rainfall proved inadequate to get at all
of the types of student misconceptions. Recent work has investigated a more flexible
representation of functional relationships, which allows the description of the processes that
collectively determine rainfall from multiple viewpoints--e.g., the temporal-causal-subprocess
view captured in the scripts and the functional viewpoint, which emphasizes the roles that
different objects play in the various processes (Stevens, Collins, & Goldin, 1978).
Misconceptions about rainfall are represented as errors in the student's model of these
relationships. A functional relationship has four components: (a) a set of actors, each with a
role in the process; (b) a set of factors that affect the process--the factors are all attributes
of the actors (e.g., water is an actor in the Evaporation relationship and its temperature is a
factor); (c) the result of the process--this is always a change in an attribute of one of the
actors; and (d) the relationship that holds between the actors and the result, or how an
attribute gets changed. These functional relationships may be the result of models from other
domains that are applied metaphorically to the domain under discussion (Stevens & Collins,
1978).
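
The four components of a functional relationship suggest a simple record structure. The following sketch is hypothetical: the class and field names, the roles, the "air" actor, and the "humidity" result are invented for illustration. Only the facts that water is an actor in Evaporation and that its temperature is a factor come from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attribute:
    actor: str   # the actor that owns the attribute
    name: str    # e.g. "temperature"

@dataclass
class FunctionalRelationship:
    name: str
    actors: dict       # (a) actor name -> role in the process
    factors: list      # (b) attributes of actors that affect the process
    result: Attribute  # (c) the actor attribute that changes
    relation: str      # (d) how the factors bring about the change

    def check(self):
        """List violations of the constraint that factors and the result
        must be attributes of the relationship's own actors."""
        return [f"{a.actor}.{a.name} does not belong to an actor"
                for a in self.factors + [self.result]
                if a.actor not in self.actors]

evaporation = FunctionalRelationship(
    name="Evaporation",
    actors={"water": "moisture source", "air": "destination"},
    factors=[Attribute("water", "temperature"), Attribute("air", "temperature")],
    result=Attribute("air", "humidity"),
    relation="warmer water and warmer air raise the humidity of the air",
)
print(evaporation.check())
```

Under this reading, a misconception such as "cooling causes air to rise" would be stored as a relationship whose relation or factors differ from the tutor's version.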


Summary 

The WHY system started as an extension of SCHOLAR by the implementation of rules
that characterize Socratic tutoring heuristics. Subsequently, an effort was made to describe
the global strategies used by human tutors to guide the dialogue. Since these were directed
towards dispelling students' misconceptions, five classes of misconceptions were
established, as well as means for correcting them. Many misconceptions are not domain
independent, and the key to more versatile tutoring lies in continuing research on knowledge
representation.


References 

The most recent reference on the research reported here is Stevens, Collins, & Goldin
(1978). The tutorial rules are discussed fully in an excellent article by Collins (  ).
Later work on the goal structure of a tutor is reported in Stevens & Collins (1977). Finally,
recent work on conceptual models and multiple viewpoints of complex systems is discussed
in Stevens & Collins (1978).


C3. SOPHIE

SOPHIE (a SOPHIsticated Instructional Environment) is an ICAI system developed by
John Seely Brown, Richard Burton, and their colleagues at Bolt, Beranek and Newman, Inc., to
explore a wider range of student initiative during the tutorial interaction
(Brown, Burton, & Bell, 1975). The SOPHIE system provides the student with a learning
environment in which he learns problem-solving skills by trying out his ideas, rather than by
instruction. The system has a model of the problem-solving knowledge in its domain as well
as numerous heuristic strategies for answering the student's questions, criticizing his
hypotheses, and suggesting alternative theories for his current hypotheses. SOPHIE enables
the student to have a one-to-one relationship with an "expert" who helps him create his own
ideas, experiment with these ideas and, when necessary, debug them.

Figure 1 illustrates the component modules of the original SOPHIE-I system (Brown,
Rubinstein, & Burton, 1976) and the additional capabilities added for the SOPHIE-II system,
discussed later in this article.


Figure 1. SOPHIE-I and SOPHIE-II.




SOPHIE-I

SOPHIE teaches problem-solving skills in the context of a simulated electronics
laboratory. The problem facing the student is to find the "faults" in a malfunctioning piece of
equipment whose characteristics he obtains by taking "measurements." The
student obtains a sequence of measurements (such as voltages, currents, and resistances)
to determine what is wrong with the equipment and asks questions about the measurements
and about what fault hypotheses he should consider, given the information he has discovered
so far. SOPHIE evaluates the hypotheses by considering all of the information that the
student should have been able to derive from his current set of measurements. The system
also judges the merits of a student's suggestion for a new measurement in light of the prior
sequence of measurements. SOPHIE can decide if a measurement is valuable, depending on
whether it eliminates or isolates a fault, and will inform the student when a proposed
measurement provides no new information.

The following portion of a sample protocol illustrates some of the linguistic and
inferential capabilities of the SOPHIE system (from Brown & Burton, 1975). SOPHIE randomly
selects a fault (in the performance of a circuit component), inserts it into a model or
simulation of the circuit, and tells the student how the controls are set. The student is given
a schematic diagram of the circuit, and the dialogue refers to components in the schematic
diagram (e.g., resistor 8 (R8), node connection 1 (N1), capacitor 5 (C6)). The student begins
his troubleshooting by performing measurements in the faulted circuit.

Welcome to Sophie, a reactive learning environment. The piece of
equipment you are studying is a regulated power supply. Would you like
to start with a faulted instrument that needs fixing?

**  YES 

The instrument has a fault and its present control settings are:

CC      1.0     CURRENT CONTROL
CR      HIGH    CURRENT RANGE SWITCH
LOAD            LOAD RESISTANCE IN OHMS
VC      1.0     VOLTAGE CONTROL
VR      HIGH    VOLTAGE RANGE SWITCH

** WHAT IS THE OUTPUT?

The output voltage is 11.7 volts.

Notice  that  "output"  was  disambiguated  as  "output  voltage." 

**  WHAT  IS  IT  IN  A  WORKING  INSTRUMENT? 

In a working circuit the output voltage is 19.9 volts.

In case the student does not know what the value of a measurement should have
been, he can ask for any measurement in a properly functioning instrument. The student
then requests an internal measurement.

** WHAT IS THE VOLTAGE BETWEEN R8 AND Q1?

The voltage at N1 is 47.6 volts.


**  REPLACE  Q6 

The  student  believes  that  he  has  discovered  the  problem  (in  transistor  Q6)  and  tries  to  fix 
it.  Before  a  component  is  replaced,  however,  the  system  queries  him  as  to  how  he  thinks  it 
is  faulted. 


I  am  going  to  ask  you  some  questions  about  how  Q6  is  faulted. 

Are  any  junctions  shorted? 

**  I  DON'T  KNOW 

Are  any  terminals  open? 

**  YES 

Which terminal(s) are open?

** THE BASE

Q6 is not faulted in that manner. Q6 not replaced.

After  learning  that  Q6  does  not  have  an  open  base,  the  student  asks  if  this  was  at  least  a 
reasonable  hypothesis. 

**  IS  IT  POSSIBLE  THAT  Q6  HAS  AN  OPEN  BASE? 

Before the system's evaluation procedures can be applied to the student's idea or
hypothesis, the hypothesis must be completely specified. To help the student out, the
system suggests a plausible complete specification of it in an abbreviated form familiar to
the student.

Is  the  complete  fault  mode  EB/EC/BC  =  Open/Ok/Open  what  you  intended? 

**  SURE 

That  fault  is  consistent  with  all  of  the  measurements  you  have  made. 

The  hypothesis  is  reasonable,  but  other  measurements  will  indicate  that  another  component 
is faulty.


Natural Language Processing--Semantic Grammar

In an instructional environment, it is important that the student be provided with a
convenient way in which to communicate his ideas to the system. The student will become
quickly frustrated if he has to try several ways of expressing an idea before the system can
understand it. SOPHIE's natural language understander copes with various linguistic problems
such as anaphoric references and context-dependent deletions and ellipsis, which occur
frequently in natural dialogues.

SOPHIE's natural language capabilities are based on the concept of a semantic grammar
in which the usual syntactic categories such as noun, verb, and adjective are replaced by
semantically meaningful categories (Burton, 1976b, and Burton and Brown, 1979b). These
categories represent concepts known to the system--such as "measurements," "circuit
elements," "transistors," and "hypotheses." For each concept there is a grammar rule that
gives the alternate ways of expressing that concept in terms of its constituent concepts.
Each rule is encoded as a LISP procedure that specifies the order of application of the
various alternatives in each rule.

A grammar centered around semantic categories allows the parser to deal with a
certain amount of "fuzziness" or uncertainty in its understanding of the words in a given
statement; that is, if the parser is searching for a particular instantiation of a semantic
category, and the current word in the sentence fails to satisfy this instantiation, it skips
over that word and continues searching. Thus, if the student uses certain words or concepts
that the system doesn't know, the parser can ignore these words and try to make sense of
what remains. In order to limit the negative consequences that may result from a
misunderstood question, SOPHIE responds to the student's question with a full sentence that
tells him what question is being answered. (See Article Natural Language.F7 about the
semantic grammar used in the LIFER system.)
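
The word-skipping behavior of a semantic grammar can be illustrated with a very small parser. This is a loose sketch in Python rather than LISP, and everything in it is assumed for illustration: the two vocabularies, the single rule (a quantity followed somewhere later by a part), and the wording of the restatement. It is not SOPHIE's grammar.

```python
# Hypothetical vocabularies for two semantic categories.
QUANTITIES = {"voltage", "current", "resistance"}
PARTS = {"r8", "q1", "q6", "n1"}

def find(words, start, vocabulary):
    """Scan forward from `start`, skipping words that do not belong to the
    category; return (word, next_index), or (None, start) if none is found."""
    for i in range(start, len(words)):
        if words[i] in vocabulary:
            return words[i], i + 1
    return None, start

def parse_measurement(sentence):
    """Parse a measurement request: a QUANTITY followed later by a PART.
    Words outside both categories are simply ignored."""
    words = sentence.lower().replace("?", " ").split()
    quantity, i = find(words, 0, QUANTITIES)
    part, _ = find(words, i, PARTS)
    if quantity is None or part is None:
        return None
    return {"quantity": quantity, "part": part.upper()}

def restate(parsed):
    """Echo the interpretation in a full sentence so that a
    misunderstanding is visible to the student."""
    return f"You asked for the {parsed['quantity']} at {parsed['part']}."

parsed = parse_measurement("What is the blasted voltage at R8?")
print(restate(parsed))
```

The unknown word "blasted" is skipped, and the restatement lets the student confirm that the intended question is the one being answered.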


Inferencing Strategies

In order to interact with the student, SOPHIE performs several different logical and
tutorial tasks. First, there is the task of answering hypothetical questions. For example, the
student might ask, "If the base-emitter junction of the voltage limiting transistor opens, then
what happens to the output voltage?"

A second task SOPHIE must perform is that of hypothesis evaluation, where the student
asks, "Given the measurements I have made so far, could the base of transistor Q3 be
open?" The problem here is not to determine if the assertion "the base of Q3 is open" is
true, but whether this assertion is logically consistent with the data that have already been
collected by the student. If it is not consistent, the program explains why it is not. When it
is consistent, SOPHIE identifies which information supports the assertion and which
information is independent of it.

A third task that SOPHIE must perform is hypothesis generation. In its simplest form this
involves constructing all possible hypotheses that are consistent with the known information.
This procedure enables SOPHIE to answer questions like, "What could be wrong with the
circuit (given the measurements that I have taken)?" The task is solved using the
generate-and-test paradigm with the hypothesis evaluation task described above performing
the "test" function.

Finally, SOPHIE can determine whether a given measurement is redundant, that is, if the
results of the measurement could have been predicted from a complete theory of the circuit,
given the previous measurements.

SOPHIE accomplishes all of these reasoning tasks using an inference mechanism that
relies principally on a general-purpose simulator of the circuit under discussion. For example,
to answer a question about a changed voltage resulting from a hypothetical modification to a
circuit, SOPHIE first interprets the question with its parser and then, using this interpretation,
simulates the desired modification. The result is a Voltage Table that represents the
voltages at each terminal in the modified circuit. The original question is then answered in
terms of these voltages.

The tasks of hypothesis evaluation and hypothesis generation are handled in a similar
manner, using the simulator. When evaluating hypotheses, SOPHIE attempts to determine the
logical consistency of a given hypothesis. To accomplish this task, a simulation of the
hypothesis is performed on the circuit model and measurements are taken. If
the values of any of these measurements are not equivalent to the measurements taken by
the student, then a counterexample has been established and it is used to critique the
student's hypothesis.

When generating hypotheses, SOPHIE attempts to determine the set of possible faults
or hypotheses that are consistent with the observed behavior of the faulted instrument.
This task is performed by a set of specialist procedures that propose a possible set of
hypotheses to explain a measurement and then simulate them to make sure that they explain
the output voltage and all of the measurements that the student has taken. Hypothesis
generation can be used to suggest possible paths to explore when the student has run out
of ideas for what could be wrong with the circuit or when he wishes to understand the full
implications of his last measurement. It is also used by SOPHIE to determine when a
measurement is redundant.
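
The way a simulator supports hypothesis evaluation, hypothesis generation, and the redundancy check can be sketched on a toy circuit. The example is an assumption throughout: a two-resistor voltage divider stands in for the power supply, the fault list and component values are invented, and the redundancy test (every surviving hypothesis predicts the same reading) is one plausible reading of the description above, not SOPHIE's procedure.

```python
# Hypothetical divider: SUPPLY -- R1 -- node N1 -- R2 -- ground.
SUPPLY, R1, R2 = 12.0, 100.0, 100.0
OPEN, SHORT = float("inf"), 0.0
FAULTS = {
    "R1 open": (OPEN, R2), "R2 open": (R1, OPEN),
    "R1 shorted": (SHORT, R2), "R2 shorted": (R1, SHORT),
}

def simulate(fault):
    """Return the measurements the circuit would show with this fault."""
    r1, r2 = FAULTS[fault]
    if OPEN in (r1, r2):
        current = 0.0
        v_n1 = SUPPLY if r2 == OPEN else 0.0
    else:
        current = SUPPLY / (r1 + r2)
        v_n1 = current * r2
    return {"V(N1)": v_n1, "I": current}

def evaluate(fault, observed, tol=1e-6):
    """Hypothesis evaluation: list the counterexamples, i.e. observed
    measurements that the simulated fault fails to reproduce."""
    predicted = simulate(fault)
    return [(m, predicted[m], v) for m, v in observed.items()
            if abs(predicted[m] - v) > tol]

def generate(observed):
    """Hypothesis generation by generate-and-test."""
    return [f for f in FAULTS if not evaluate(f, observed)]

def is_redundant(measurement, observed):
    """A measurement is redundant if every surviving hypothesis
    predicts the same value for it."""
    values = {round(simulate(f)[measurement], 6) for f in generate(observed)}
    return len(values) <= 1

observed = {"V(N1)": 12.0}
print(generate(observed))
print(evaluate("R1 open", observed))
print(is_redundant("I", observed))
```

With only the node voltage known, two faults remain possible, so measuring the current is informative, whereas repeating the voltage measurement would not be.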


SOPHIE-II:  The  Augmented  SOPHIE  Lab 

Extensions to SOPHIE include: (a) a troubleshooting game involving two teams of students
and (b) the development of an articulate expert debugger/explainer. The simple reactive
learning environment has also been augmented by the development of frame-oriented CAI
lesson material, used to prepare the student for the laboratory interaction (Brown,
Rubinstein, & Burton, 1976). The articulate expert not only locates student-inserted faults
in a given instrument but can articulate exactly the deductions that led to their discovery, as
well as the more global strategies that guide the troubleshooting scenario.

Experience with SOPHIE indicates that its major weakness is an inability to follow up on
student errors. Since SOPHIE is to be reactive to the student, it will not take the initiative
to explore a student's understanding or suggest approaches that he does not consider.
However, the competitive environment of the troubleshooting game, in which partners share a
problem and work it out together, was found to be an effective means of exercising the
student's knowledge of the operation of the instrument being debugged. Finally, an
experiment involving a minicourse--and exposure to the frame-based texts, the expert, and
the original SOPHIE lab--indicated that long-term use of the system is more effective than a
single, concentrated exposure to the material (Brown, Rubinstein, & Burton, 1976).


Summary

The goal of the SOPHIE project was to create a learning environment in which the
student would be challenged to explore ideas on his own and to create conjectures or
hypotheses about a problem-solving situation. The student receives detailed feedback as to
the logical validity of his proposed solutions. In cases where the student's ideas have
logical flaws, SOPHIE can create relevant counterexamples and critiques. The SOPHIE
system combines domain-specific knowledge and powerful domain-independent inferencing
mechanisms to answer questions that even human tutors might find it extremely difficult to
answer.


References


C4. WEST

Development of the first computer coach was undertaken by Richard Burton and John
Seely Brown at Bolt, Beranek and Newman, Inc., for the children's board game called How the
West Was Won. The term "coach" describes a computer-based learning environment where
the student is involved in an activity, like playing a computer game, and the instructional
program operates by "looking over his shoulder" during the game and occasionally offering
criticisms or suggestions for improvement (Goldstein, 1977). This research focused on
identifying: (a) diagnostic strategies required to infer a student's misunderstandings from his
observed behavior and (b) various explicit tutoring strategies for directing the tutor to say the
right thing at the right time (Burton & Brown, 1976, and Burton & Brown, 1979). The intention
of this work was to use these strategies to control the interaction so that the instructional
program took every possible opportunity to offer help to the student without interrupting so
often as to become a nuisance and destroy the student's fun at the game. By guiding a
student's learning through discovery, computer-based coaching systems hold the promise of
enhancing the educational value of the increasingly popular computer-gaming environments.


Philosophy  of  the  Instructional  Coach 

The pedagogical ideas underlying much of computer coaching research in WEST can be
characterized as guided discovery learning. It assumes that the student constructs his
understanding of a situation or a task based on his prior knowledge. According to this theory,
the notion of misconception or bug plays a central role in the construction process. Ideally, a
bug in the student's knowledge will cause an erroneous result in his behavior, which the
student will notice. If the student has enough information to determine what caused the
error and can then correct it, the bug is referred to as constructive. The role of a tutor in an
informal environment is to give the student extra information in situations that would
otherwise be confusing to him, so that he can determine what caused his error and can
transform nonconstructive bugs into constructive ones (see Fischer, Brown, & Burton, 1978,
for further discussion).

However, an important constraint on the coach is that it should not interrupt the
student too often. If the coach immediately points out the student's errors, there is a
danger that the student will never develop the necessary skills for examining his own
behavior and looking for the causes of his mistakes himself. The tutor must be perceptive
enough to make relevant comments, but not be too intrusive, destroying the fun of the game.
The research on the WEST system examined a wide variety of tutorial strategies that must
be included to create a successful coaching system.


How  the  West  Was  Won 

How the West Was Won was originally a computer board game designed by Bonnie
Anderson of the Elementary Mathematics Project at the PLATO computer-based education
system at the University of Illinois (Dugdale & Kibbey, 1977). The purpose of this original
(nontutorial) program was to give elementary-school students drill and practice in arithmetic.
The game resembles the popular Chutes and Ladders board game and, briefly, goes
something like this: At each turn a player receives three numbers (from spinners) with which
he constructs an arithmetic expression using the operations of addition, subtraction,
multiplication, and division. The numeric value of the completed expression is the number of
spaces the player can move, the object of the game being to get to the end first.

However, the strategy of combining the three numbers to make the biggest valued
expression is not always the best strategy, because there are several special features on
the game board. Towns occur every ten spaces, and if a player lands on one, he skips ahead
to the next town. There are also shortcuts, and if he lands on the beginning of one, a player
advances to the other end of the shortcut. Finally, if the player lands on the space that his
opponent is occupying, the opponent is bumped back two towns. The spinner values in WEST
are small, so these special moves are encouraged (i.e., landing on towns or shortcuts or on
your opponent).
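
A short sketch shows why the largest expression is not always the best move. Several details are assumptions made for illustration: the shortcut squares are invented, any two operators may be combined, only whole positive values count as moves, and bumping the opponent is left out. The actual game may differ on each point.

```python
from fractions import Fraction
from itertools import permutations, product
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}
TOWN_SPACING = 10             # towns occur every ten spaces
SHORTCUTS = {5: 13, 25: 36}   # hypothetical start -> end squares

def expressions(a, b, c):
    """Yield (text, value) for expressions that use each number once."""
    for x, y, z in permutations((Fraction(a), Fraction(b), Fraction(c))):
        for p, q in product(OPS, repeat=2):
            try:
                yield f"({x}{p}{y}){q}{z}", OPS[q](OPS[p](x, y), z)
            except ZeroDivisionError:
                pass
            try:
                yield f"{x}{p}({y}{q}{z})", OPS[p](x, OPS[q](y, z))
            except ZeroDivisionError:
                pass

def land(position, move):
    """Square reached after applying the shortcut and town rules."""
    square = position + move
    if square in SHORTCUTS:
        return SHORTCUTS[square]
    if square > 0 and square % TOWN_SPACING == 0:
        return square + TOWN_SPACING
    return square

def best_moves(position, spins):
    """Return (final square, one expression reaching it), best first."""
    outcomes = {}
    for text, value in expressions(*spins):
        if value.denominator == 1 and value > 0:
            outcomes.setdefault(land(position, int(value)), text)
    return sorted(outcomes.items(), reverse=True)

print(best_moves(3, (1, 2, 3))[:3])
```

From square 3 with spins 1, 2, and 3, the largest value is 9, which reaches square 12, while a move of 7 lands on the town at square 10 and skips ahead to square 20.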


Diagnostic  Modeling 

There are two major related problems that must be solved by the computer coach.
They are (a) when to interrupt the student's problem-solving activity, and (b) what to say
once it has been interrupted. In general, solutions to these problems require both techniques
for determining what the student knows (procedures for constructing a diagnostic model) and
explicit tutoring principles about interrupting and advising. These, in turn, require theories
about how a student forms abstractions, how he learns, and when he is apt to be most
receptive to advice. Unfortunately, few, if any, existing psychological theories are precise
enough to suggest anything more than caution.

Since the student is primarily engaged in a gaming or problem-solving activity, diagnosis
of his strengths and weaknesses must be unobtrusive to his main activity. This objective
means that the diagnostic component cannot use pre-stored tests or pose a lot of diagnostic
questions to the student. Instead, the computer coach must restrict itself mainly to inferring
a student's shortcomings from what he does in the context of playing the game or solving the
problem. This objective can create a difficult problem: just because a student does not use
a certain skill while playing a game does not mean that he does not know that skill. Although
this point seems quite obvious, it poses a serious diagnostic problem: The absence of a
potential skill carries diagnostic value if and only if an expert in an equivalent situation would
have used that skill. Hence, apart from his outright errors, the main window a
computer-based coach has on a student's misconceptions is through a differential modeling
technique that compares what the student is doing with what the expert would be doing in his
place. This difference provides hypotheses about what the student does not know or has not
yet mastered. (See the related discussion of overlay models in Article C5.)
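
The differential modeling idea can be sketched as a tally that is updated after every move. The representation is assumed for illustration: skills are plain strings, each move is described by the set of skills it uses, and the model keeps "used" and "missed" counts per skill. WEST's own data structures are not described here.

```python
from collections import Counter

def update_model(model, student_skills, better_move_skills):
    """Update a differential model after one move.

    model              -- dict mapping skill -> Counter of 'used' / 'missed'
    student_skills     -- set of skills shown by the student's move
    better_move_skills -- list of skill sets, one per better Expert move
                          (empty when the student's move was optimal)
    """
    for skill in student_skills:
        model.setdefault(skill, Counter())["used"] += 1
    # A skill counts as missed only if the Expert would have used it here.
    expert_skills = set().union(*better_move_skills)
    for skill in expert_skills - set(student_skills):
        model.setdefault(skill, Counter())["missed"] += 1
    return model

model = {}
update_model(model, {"addition"},
             [{"parentheses", "multiplication"}, {"shortcut", "addition"}])
update_model(model, {"addition"}, [])   # an optimal move: nothing is missed
print(model)
```

A skill that neither the student nor the Expert used, such as division in this run, never enters the model, which reflects the point that its absence carries no diagnostic value.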

Constructing the differential model requires that two tasks be performed by the coach,
using the computer Expert (the subprogram that is expert at playing the game WEST). The
first task of the coach is to evaluate the student's current move with respect to the set of
possible alternative moves that an Expert might have made in exactly the same
circumstances. The second task is to determine what underlying skills were used to select
and compose the student's move and each of the "better" moves of the Expert. To
accomplish the evaluative task, the Expert need only use the results of its knowledge and
reasoning strategies, available as better moves. However, for the second task, the coach
has to consider the "pieces" of knowledge involved in move selection and in the generation
of better moves, since the absence of one of these pieces of knowledge might explain why
the student failed to make a better move.


Tutoring by Issue and Example -- A General Paradigm

One of the top-level goals driving the coach is the objective that its comments be both
relevant to the situation and memorable to the student. The Issues and Examples tutoring
strategy provides a framework for meeting these two constraints. Issues are concepts used
in the diagnostic process to identify, at any particular moment, what is relevant. Examples
provide concrete instances of these abstract concepts. Providing both the description of a
generic issue (a concept used to select a strategy) and a concrete example of its use
increases the chance that the student will integrate this piece of tutorial commentary into
his knowledge. In the Issues and Examples paradigm, the issues embody the important
concepts underlying a student's behavior. They define the space of concepts that the
Coach can address--the facets of the student's behavior that are monitored by the Coach.

In WEST, there are three levels of issues on which a Coach can focus: At the lowest
level are the basic mathematical skills that the student is practicing (the use of
parentheses, the use of the various arithmetic operations, and the form or pattern of the
student's move as an arithmetic expression). The second level of issues concerns the
skills needed to play WEST (like the special moves: bump, town, and shortcut) and the
development of a strategy for choosing moves. At the third level are the general skills of
game playing (like watching your opponent to learn from his moves), which are not addressed
by the WEST program.

Each of the issues is represented in two parts, a recognizer and an evaluator. The issue
recognizer is data-directed; it watches the student's behavior for evidence that he does or
does not use a particular concept or skill. The recognizers are used to construct a model of
the student's knowledge. The issue evaluators are goal-directed; they interpret this model
to determine the student's weaknesses. The issue recognizers of WEST are fairly
straightforward but are, nevertheless, more complex than simple pattern matchers. For
example, the recognizer for the PARENTHESIS issue must determine not only whether or not
parentheses are present in the student's expression, but also whether they were necessary
for his move, or for an optimal move.

Figure 1 is a diagram of the modeling/tutorial process underlying the Issues and
Examples paradigm. Figure 1a presents the process of constructing a model of the student's
behavior. It is important to observe that without the Expert it is impossible to determine
whether the student is weak in some skill or whether the skill has not been used because
the need for it has arisen infrequently in the student's experience.


Figure 1. Diagram of the Modeling/Coaching Process


The Coaching Process

Figure 1b presents the top level of the coaching process. When the student makes a
less than optimal move (as determined by comparing his move with that of the Expert), the
Coach uses the evaluation component of each issue to create a list of issues on which it has
assessed that the student is weak. From the Expert's list of better moves, the Coach
invokes the issue recognizers to determine which issues are illustrated by these better
moves. From these two lists of issues, the Coach selects an issue and the move that
illustrates it (i.e., creates an example of it) and decides, on the basis of tutoring principles,
whether or not to interrupt. If the two lists have no issues in common, the reason for the
student's problem lies outside the collection of issues, and the Coach says nothing.

If the Coach decides to interrupt, the selected issue and example are then passed to
the explanation generators, which produce the feedback to the student. Currently, the
explanations are stored in procedures, called Speakers, attached to each issue. Each
Speaker is responsible for presenting a few lines of text explaining its issue. (See also the
related discussion of computer coaching in Article C5 on WUMPUS.)
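
The top level of this process can be sketched in a few lines. The sketch assumes that the evaluators have already produced the set of weak issues and that the recognizers have already labeled each better move. The function names, the SPEAKERS table, and the message wording are hypothetical.

```python
def choose_issue_and_example(weak_issues, better_moves):
    """Pick an issue on which the student is weak together with a better
    move that illustrates it.

    weak_issues  -- set of issue names from the issue evaluators
    better_moves -- list of (move_text, issues_illustrated), best move first
    Returns (issue, move_text), or None if the two lists share no issue.
    """
    for move, issues in better_moves:
        common = sorted(weak_issues & issues)
        if common:
            return common[0], move
    return None

# Hypothetical Speakers: one short explanation procedure per issue.
SPEAKERS = {
    "parentheses": lambda move: f"Parentheses would have allowed {move}.",
    "shortcut": lambda move: f"With {move} you could have taken a shortcut.",
}

def coach(weak_issues, better_moves, may_interrupt=True):
    """Return the Coach's comment, or None when it stays silent."""
    choice = choose_issue_and_example(weak_issues, better_moves)
    if choice is None or not may_interrupt:
        return None
    issue, move = choice
    return SPEAKERS[issue](move)

better = [("(1+2)*3", {"parentheses", "multiplication"})]
print(coach({"parentheses"}, better))
print(coach({"division"}, better))
```

When the weak issues and the issues illustrated by the better moves do not overlap, the function returns None, which corresponds to the Coach saying nothing.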


Tutoring  Principles 

General  tutoring  principles  dictate  that,  at  time,,  even  when  relevant  Issues  and 
Examples  have  been  identified,  it  may  be  inappropriate  to  interrupt.  For  example,  what  if 
there  are  two  competing  issues,  both  applicable  to  a  certain  situation?  Which  one  should  be 
picked?  The  Issues  in  WEST  are  sufficiently  independent  that  there  is  little  need  to 
consider  their  prerequisite  structure,  for  example,  whether  the  use  of  parentheses  should  be 
tutored  before  division  (but  see  the  description  of  the  syllabus  in  WUMPUS,  Article  C5) 
instead,  additional  tutoring  principles  must  be  Invoked  to  decide  which  one  of  the  set  of 
applicable  Issues  should  be  used. 

In WEST, experiments have been conducted using two alternative principles to guide this
decision. The first is the Focus Strategy, which ensures that, everything else being equal,
the Issue most recently discussed is chosen--the Coach will tend to concentrate on a
particular Issue until evidence is present to indicate that it is mastered. The alternative
principle is the Breadth Strategy, where Issues that have not recently been discussed tend
to be selected. This strategy minimizes a student's boredom and ensures breadth of concept
coverage.
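The two strategies differ only in how recency is used to choose among the applicable
Issues. The following is a minimal Python sketch, not WEST's actual code: the function
name, the `last_discussed` table, and the strategy labels are assumptions made for
illustration.

```python
def choose_issue(applicable, last_discussed, strategy):
    """Pick one Issue from those applicable to the current move.

    applicable     -- list of Issue names that fit the current situation
    last_discussed -- dict: Issue name -> move number when it was last tutored
                      (Issues never discussed are simply absent)
    strategy       -- "focus" or "breadth" (hypothetical labels)
    """
    if not applicable:
        return None  # nothing relevant, so the Coach says nothing

    def recency(issue):
        # An Issue that was never discussed counts as the oldest possible.
        return last_discussed.get(issue, -1)

    if strategy == "focus":
        # Focus Strategy: stay with the most recently discussed Issue.
        return max(applicable, key=recency)
    if strategy == "breadth":
        # Breadth Strategy: prefer the Issue discussed least recently.
        return min(applicable, key=recency)
    raise ValueError(f"unknown strategy: {strategy!r}")


issues = ["parentheses", "division", "subtraction"]
history = {"parentheses": 7, "division": 2}
print(choose_issue(issues, history, "focus"))    # parentheses
print(choose_issue(issues, history, "breadth"))  # subtraction
```

Treating recency as the only criterion is a simplification; the text says the strategies
apply "everything else being equal," so a fuller version would use recency only as a
tie-breaker.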

The rest of WEST's strategies for deciding whether to raise an Issue and what to say
can be placed in the four categories listed below, with example rules of each:

1. Coaching Philosophy. Tutoring principles can enhance a student's likelihood
of remembering what is said. For example, "When illustrating an Issue, use an
Example (an alternative move) only when the result or outcome of that move
is dramatically superior to the move made by the student."

2. Maintaining Interest in the Game. The Coach should not destroy the
student's inherent interest in the game by interrupting too often. For
example, "Never tutor on two consecutive moves," or "If the student makes
an exceptional move, identify why it is good and congratulate him."

3. Increasing Chances of Learning. Four levels of hints are provided by the
WEST tutor, at the student's request: (a) isolate a weakness and directly
address that weakness, (b) delineate the space of possible moves at this
point in the game, (c) select the optimal move and tell why it is optimal, and
(d) describe how to make the optimal move.

4. Environmental Considerations. The Coach should consider the game-playing
environment. For example, "If the student makes a possibly careless error,
one for which there is evidence that he knows better, be forgiving."
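Several of the example rules quoted above can be read as guards on a single decision to
interrupt. The sketch below is one possible Python rendering under stated assumptions: the
numeric move values, the `min_advantage` threshold standing in for "dramatically superior,"
and all names are hypothetical and do not come from WEST.

```python
def should_interrupt(move_no, last_tutored_move, student_value, best_value,
                     min_advantage=10, probably_careless=False):
    """Decide whether the Coach should speak after the current move.

    move_no           -- number of the move just made
    last_tutored_move -- move number of the last interruption, or None
    student_value     -- assumed numeric worth of the student's move
    best_value        -- assumed numeric worth of the best alternative (the Example)
    min_advantage     -- assumed gap that counts as "dramatically superior"
    probably_careless -- True if there is evidence the student knows better
    """
    # "Never tutor on two consecutive moves."
    if last_tutored_move is not None and move_no - last_tutored_move <= 1:
        return False
    # "If the student makes a possibly careless error ... be forgiving."
    if probably_careless:
        return False
    # Use an Example only when its outcome is dramatically superior.
    return best_value - student_value >= min_advantage


print(should_interrupt(5, 4, student_value=3, best_value=30))   # False
print(should_interrupt(6, 4, student_value=3, best_value=30))   # True
print(should_interrupt(6, 4, student_value=25, best_value=30))  # False
```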


Noise in the Model

When the student does not make an optimal move, the program knows only that at least
one of the Issues required for that move was not employed by the student. Which of these
Issues blocked the student from making the move is not known. In practice, blame is
apportioned more or less equally among all of the Issues required for a missed better move.
One effect of this apportionment is the introduction of noise into the model; that is, blame will
almost certainly be apportioned to Issues that are, in fact, understood. Also, since the
system does not account for the entire process that a person uses to derive a move, the set
of Issues is, by definition, incomplete. This is the second source of noise in the differential
model. A third source of noise in the model is the difficulty of modeling certain human factors,
such as boredom or fatigue, that cause inconsistent behaviors. For example, students are
seldom completely consistent. They often forget to use techniques that they know, or get
tired and accept a move that is easy to generate but that does not reflect their
knowledge.

Another source of noise is inherent in the process of learning. As the student plays the
game, he acquires new skills. The student model, which has been accumulating during the
course of his play, will not be up to date; that is, it will still show the newly learned Issues as
"weaknesses." Ideally, the "old pieces" of the model should decay with time. Unfortunately,
the costs involved in this computation are prohibitive. To avoid this particular failing of the
model, the WEST Coach removes from consideration any Issues that the student has used
recently (in the last three moves), assuming that they are now part of his knowledge.

To combat the noise that arises in the model, the Evaluator for each Issue tends to
assume that the student has mastery of the Issue. Some coaching opportunities may be
missed, but eventually, if the student has a problem addressed by an Issue, a pattern will
emerge.
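The bookkeeping described in this section can be sketched in a few lines of Python. This
is an illustration under assumptions rather than WEST's implementation: blame is split
exactly equally, the Evaluator's bias toward assuming mastery is modeled as a simple blame
threshold, and every name and constant below is hypothetical.

```python
from collections import defaultdict


class DifferentialModel:
    """Toy differential model: blame per Issue, plus a recency filter."""

    def __init__(self, recent_window=3, blame_threshold=2.0):
        self.blame = defaultdict(float)   # Issue -> accumulated blame
        self.last_used = {}               # Issue -> move number of last use
        self.recent_window = recent_window        # "the last three moves"
        self.blame_threshold = blame_threshold    # assumed mastery bias

    def record_move(self, move_no, used, missed):
        """used: Issues the student employed; missed: Issues that the
        better move required but the student's move did not show."""
        for issue in used:
            self.last_used[issue] = move_no
        if missed:
            share = 1.0 / len(missed)     # blame split equally (assumption)
            for issue in missed:
                self.blame[issue] += share

    def weaknesses(self, move_no):
        """Issues still treated as weaknesses at move `move_no`."""
        result = []
        for issue, blame in self.blame.items():
            last = self.last_used.get(issue)
            recently_used = (last is not None
                             and move_no - last <= self.recent_window)
            # Assume mastery unless enough blame has accumulated, and drop
            # any Issue the student has used recently.
            if blame >= self.blame_threshold and not recently_used:
                result.append(issue)
        return sorted(result)


model = DifferentialModel()
model.record_move(1, used=[], missed=["parentheses", "division"])
model.record_move(2, used=[], missed=["parentheses"])
model.record_move(3, used=[], missed=["parentheses", "division"])
print(model.weaknesses(4))   # ['parentheses']
model.record_move(4, used=["parentheses"], missed=[])
print(model.weaknesses(5))   # []
```

Note that "division" collects blame it may not deserve, which is the first kind of noise
described above; the threshold keeps that blame from triggering coaching until a pattern
emerges.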


Experiences with WEST

WEST has been used in elementary school classrooms. In a controlled experiment, the
coached version of WEST was compared to an uncoached version. The coached students
showed a considerably greater variety of patterns, indicating that they had acquired many of
the more subtle patterns and had not fallen permanently into "ruts" that prevented them from
seeing when such moves were important. Moreover, and perhaps most important of all, the
students in the coached group enjoyed playing the game considerably more than the
uncoached group (Goldstein, 1979).


C5. WUMPUS

This article describes a computer coach for WUMPUS, a computer game in which the
player must track down and slay the vicious Wumpus while avoiding pitfalls that result in
certain, if fictional, death (Yob, 1975). The coach described here is WUSOR-II, one of three
"generations" of computer coaches for WUMPUS developed by Ira Goldstein and Brian Carr at
MIT (Carr & Goldstein, 1977). (For discussions of WUSOR-I and -III, see Stansfield, Carr, &
Goldstein, 1976, and Goldstein, 1979, respectively.) To be a skilled Wumpus-hunter, one must
know about logic, probability, decision theory, and geometry. A deficit in one's knowledge
may result in being eaten by the Wumpus or falling through the center of the earth. In
keeping with the philosophy of computer coaching, students are highly motivated to learn
these fundamental skills.

The design of the WUSOR-II system involves the interactions of the specialist programs
shown in Figure 1. There are four modules: the Expert, the Psychologist, the Student Model,
and the Tutor. The Expert informs the Psychologist of two facts: (a) whether the player's move is
nonoptimal and (b) which skills are needed for him to discover better alternatives. The
Psychologist employs this comparison to formulate hypotheses about which domain-specific
skills are known to the student. These hypotheses are recorded in the Student Model, which
represents the student's knowledge as a subset of the Expert's skills--an overlay model (see
Overview B and Carr & Goldstein, 1977). The Tutor uses the Student Model to guide its
interactions with the player. Basically, it chooses to discuss skills not yet exhibited by the
player in situations where their use would result in better moves. (Goldstein (1977) provides
a more detailed discussion of the structure and function of these coaching modules. Also
see the discussion of the WEST computer coach in Article C4.)

The central box of Figure 1 contains a representation for the problem-solving skills of
the domain being tutored. It is, in essence, a formal representation of the syllabus. The
Expert is derived from the skills represented therein, as is the structure of the Student
Model. The Psychologist derives expectations from this knowledge regarding which skills the
student can be expected to acquire next, based on a model of the relative difficulty of items
in the syllabus. The Tutor derives relationships between skills, such as analogies and
refinements, which can be employed to improve its explanations of new skills (see Goldstein,
1979).


Theoretical  Goals:  Toward  a  Theory  of  Coaching 

The approach to the design of computer coaches in WUSOR-II is to construct rule-based
representations (see the Representation chapter) for (a) the skills needed by the Expert to play
the game, (b) the modeling criteria used by the Psychologist, and (c) the alternative tutoring
strategies used by the Tutor. Each is expanded below:


[Figure 1 is not reproduced here; the labels that survive are PSYCHOLOGIST, MOVE ANALYSIS,
complexity DATA, and update MODEL.]

Fig. 1. Simplified block diagram of a computer coach.

The Expert uses rules that embody the knowledge or skills required to play the game
to analyze the player's behavior. The virtue of a rule-based representation of expertise is
that its modularity both allows tutoring to focus concisely on the discussion of specific skills
and permits modeling to take the form of hypotheses regarding which rules are known by the
player.

The  Psychologist  uses  rules  of  evidence  to  make  reasonable  hypotheses  about  which  of 
the  Expert's  skills  the  player  possesses.  Typical  rules  of  evidence  are: 

Increase the estimate that a player possesses a skill if the player explicitly
claims acquaintance with the skill, and decrease the estimate if the player
expresses unfamiliarity.

Increase the estimate that a player possesses a skill if the skill is manifest in the
player's behavior, and decrease the estimate if the skill is not manifest in a
situation where the Expert believes it to be appropriate; hence, implicit as well
as overt evidence plays a role.

Decrease the estimate that a player possesses a skill if there is a long interval
since the last confirmation was obtained (thereby modeling the tendency for a
skill to decay with little use).
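The three rules of evidence can be sketched as updates to a per-skill estimate. The Python
below is an illustration only: the 0-to-1 scale, the step sizes, the length of a "long
interval," and all names are assumptions, and WUSOR-II's actual rules are not given in
this article.

```python
class SkillEstimates:
    """Toy Psychologist: one estimate per Expert skill, between 0 and 1."""

    def __init__(self, skills, step=0.2, decay=0.1, max_gap=10):
        self.estimate = {s: 0.5 for s in skills}      # start undecided
        self.last_confirmed = {s: 0 for s in skills}  # move of last evidence
        self.step, self.decay, self.max_gap = step, decay, max_gap

    def _adjust(self, skill, delta):
        value = self.estimate[skill] + delta
        self.estimate[skill] = min(1.0, max(0.0, value))

    def player_says(self, skill, familiar, move_no):
        """Rule 1: explicit claims of acquaintance or unfamiliarity."""
        self._adjust(skill, self.step if familiar else -self.step)
        if familiar:
            self.last_confirmed[skill] = move_no

    def observe_move(self, skill, used, move_no):
        """Rule 2: call only when the Expert judges the skill appropriate."""
        self._adjust(skill, self.step if used else -self.step)
        if used:
            self.last_confirmed[skill] = move_no

    def apply_decay(self, move_no):
        """Rule 3: lower estimates that have gone long without confirmation."""
        for skill, last in self.last_confirmed.items():
            if move_no - last > self.max_gap:
                self._adjust(skill, -self.decay)


model = SkillEstimates(["L2", "P2"])
model.observe_move("P2", used=False, move_no=2)   # skill was appropriate, not used
model.player_says("L2", familiar=True, move_no=2)
model.apply_decay(move_no=20)
print({k: round(v, 2) for k, v in model.estimate.items()})
```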

The Tutor uses explanation rules to select the appropriate topic to discuss with the
player and to choose the form of the explanation. These rules include:

Rules of simplification that take a complex statement and reduce it to a simpler
assertion. Simplification rules are essential if the player is not to be
overwhelmed by the Tutor's explanations.

Rules of rhetoric that codify alternative explanation strategies. The two extremes
are explanation in terms of a general rule and explanation in terms of a concrete
instance.


The  WUMPUS  Expert 

In WUMPUS, the player is initially placed somewhere in a randomly connected warren of
caves and told the neighbors of his current location. His goal is to locate the horrid Wumpus
and slay it with an arrow. Each move to a neighboring cave yields information regarding that
cave's neighbors. The difficulty in choosing a move arises from the existence of dangers in
the warren--bats, pits, and the Wumpus itself. If the player moves into the Wumpus's lair, he
is eaten. If he walks into a pit, he falls to his death. Bats pick the player up and randomly
drop him elsewhere in the warren.

The player can minimize risk and locate the Wumpus by making the proper logical and
probabilistic inferences from warnings that he is given. These warnings are provided
whenever the player is in the vicinity of a danger. The Wumpus can be smelled within one or
two caves. The squeak of bats can be heard one cave away and the breeze of a pit felt
one cave away. The game is won by shooting an arrow into the Wumpus's lair; if the player
exhausts his set of five arrows without hitting the creature, the game is lost.

The WUMPUS Expert uses a rule-based representation, consisting of approximately 20
rules, to infer the risk of visiting new caves. Five of these rules are shown below:

L1. Positive Evidence Rule. A warning in a cave implies that a danger exists in a
neighbor.

L2. Negative Evidence Rule. The absence of a warning implies that no danger
exists in any neighbors.

L3. Elimination Rule. If a cave has a warning and all but one of its neighbors are
known to be safe, then the danger is in the remaining neighbor.

P1. Equal Likelihood Rule. In the absence of other knowledge, all of the
neighbors of a cave with a warning are equally likely to contain a danger.

P2. Double Evidence Rule. Multiple warnings increase the likelihood that a given
cave contains a danger.
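The three logical rules compose naturally as set operations. The following Python sketch
is an assumption-laden illustration (the Expert itself is described later as written in
LISP): it handles a single kind of one-cave warning, treats every visited cave as free of
that danger, and uses hypothetical names and an illustrative cave layout.

```python
def apply_logical_rules(neighbors, warnings):
    """Apply rules L1-L3 for one kind of one-cave warning.

    neighbors -- dict: visited cave -> list of its neighboring caves
    warnings  -- dict: visited cave -> True if the warning was perceived there
    Returns (safe, suspects, located).
    """
    # Assumption: a cave the player has stood in does not hold this danger.
    safe = set(warnings)

    # L2, Negative Evidence: no warning means no danger in any neighbor.
    for cave, warned in warnings.items():
        if not warned:
            safe.update(neighbors[cave])

    suspects = {}   # warned cave -> neighbors that may hold the danger
    located = set() # caves proven to hold the danger
    for cave, warned in warnings.items():
        if warned:
            # L1, Positive Evidence: the danger is in some neighbor.
            candidates = [n for n in neighbors[cave] if n not in safe]
            suspects[cave] = candidates
            # L3, Elimination: only one neighbor is left.
            if len(candidates) == 1:
                located.add(candidates[0])
    return safe, suspects, located


# Illustrative layout: a squeak is heard in cave 4 but not in cave 15.
layout = {15: [4, 14, 0], 4: [15, 14, 2]}
safe, suspects, located = apply_logical_rules(layout, {15: False, 4: True})
print(sorted(safe))   # [0, 4, 14, 15]
print(located)        # {2}
```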


A Sample Protocol with the WUSOR-II Computer Coach

A transcript of an interaction with the WUSOR-II coach is shown below. The
player's responses are preceded by a **.

Hello, Timmy. There are 16 caves, 2 bats, 2 pits, and 1 Wumpus. You
are now at cave 15 with neighbors 4, 14 and 0. Brrr! There is a draft. You
are near a pit. What a stench! The Wumpus is near. What now?

** 4

You are now at cave 4 with neighbors 15, 14 and 2. Brrr! Squeak!
A bat is near. What now?

The goal of the Coach is to tutor a beginner in the relevant logical, probabilistic, and
strategic knowledge needed to play the game. For example, the Expert informs the Tutor
that cave 14 should be treated as more dangerous than 0 or 2 since there is multiple
evidence (from the drafts in 15 and 4) that 14 contains a pit. If the player now moved to
cave 14, a coaching situation might occur as follows:


**  14 


Timmy, it isn't necessary to take such large risks with pits. One of
cave 2 and 14 contains a pit. Likewise one of cave 0 and 14 contains a pit.
This is multiple evidence of a pit in cave 14 which makes it quite likely
that cave 14 contains a pit. It is less likely that cave 0 contains a pit.
Hence, we might want to explore cave 0 instead. Do you want to take back
your move?

Although it is not apparent from these simple remarks, every module of the Coach
contributed to the dialogue. These contributions are summarized below:

The Expert analyzes all possible moves, using its set of skills. The outcome of its
analysis is a ranking of possible moves with an attached list of the skills that
would be needed to make each move. For example, using the five skills listed earlier, the
Expert recognizes that cave 14 is the most dangerous move and cave 0 is the safest move.

Essentially, the Expert provides the following proof for use by the Psychologist and
Tutor modules. (The proof is given here in English for readability; the Expert's actual
analyses are in the programming language LISP.)

Lemma 1: The Wumpus cannot be in 0, 2, or 14 since there is no smell in 4.
(Application of the Negative Evidence Rule, L2, for 2-cave warning of Wumpus.)

Lemma 2: Caves 0 and 2 were better than 14 because there was single
evidence that caves 0 and 2 contained a pit, but double evidence for cave 14.
(Application of the Double Evidence Rule, P2.)

Lemma 3: Cave 2 is more dangerous than cave 0, since 2 contains a bat, and the
bat could drop you in a fatal cave. (We know this fact because the squeak in 4
implied a bat in 14 or 2; but the absence of a squeak in 15 implies no bat in 14.
Hence, by Elimination Rule, L3, there is a bat in 2.)
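The ranking that Lemmas 2 and 3 justify can be reproduced with a small counting sketch.
This Python fragment is illustrative only (the Expert's own analysis is in LISP): it
assumes that pit evidence is simply counted per cave, that a known bat breaks ties, and
that the adjacency table reflects the two caves visited in the sample protocol. All names
are hypothetical.

```python
from collections import Counter


def pit_evidence(neighbors, drafts, visited):
    """Count, for each unvisited cave, how many drafts point at it (rule P2)."""
    counts = Counter()
    for cave, draft in drafts.items():
        if draft:
            for n in neighbors[cave]:
                if n not in visited:
                    counts[n] += 1
    return counts


def rank_moves(candidates, pit_counts, bat_caves):
    """Order candidate caves from safest to most dangerous.

    Fewer pit warnings come first; among equals, a cave without a bat
    comes before a cave known to hold one.
    """
    return sorted(candidates, key=lambda c: (pit_counts[c], c in bat_caves))


neighbors = {15: [4, 14, 0], 4: [15, 14, 2]}
counts = pit_evidence(neighbors, drafts={15: True, 4: True}, visited={15, 4})
ranking = rank_moves([0, 2, 14], counts, bat_caves={2})
print(dict(counts))  # {14: 2, 0: 1, 2: 1}
print(ranking)       # [0, 2, 14]
```

The result agrees with the Expert's conclusion that cave 0 is the safest move and cave 14
the most dangerous.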

The Psychologist, after seeing Timmy move to cave 14, decreases the Student Model
weight indicating familiarity with the Double Evidence Rule, P2, since the Expert's proof
indicates that this heuristic was not applied. Table 1 shows the Psychologist's hypotheses
regarding which skills of the Expert the student possesses.


Table 1.

A Typical Student Model Maintained by the Coach

RULES    APPROPRIATE    USED    PER CENT    KNOWN

L1            5           5        100       Yes
L2            4           3         75       Yes
L3            4           2         50        ?
L4            5           5        100       Yes
L5            4           1         25       No
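The PER CENT column of Table 1 is the ratio of USED to APPROPRIATE. The cutoffs behind
the KNOWN column are not stated; the Python sketch below assumes thresholds of 75 and 25
per cent, which happen to reproduce the table, and uses hypothetical names throughout.

```python
def classify(appropriate, used, known_at=75, unknown_at=25):
    """Return (per cent used, verdict) for one rule of the Student Model.

    known_at and unknown_at are assumed thresholds, not values from WUSOR-II.
    """
    if appropriate == 0:
        return None, "?"          # no evidence either way
    percent = 100 * used // appropriate
    if percent >= known_at:
        verdict = "Yes"
    elif percent <= unknown_at:
        verdict = "No"
    else:
        verdict = "?"
    return percent, verdict


# (APPROPRIATE, USED) pairs as listed in Table 1.
observations = {"L1": (5, 5), "L2": (4, 3), "L3": (4, 2),
                "L4": (5, 5), "L5": (4, 1)}
for rule, (appropriate, used) in observations.items():
    print(rule, *classify(appropriate, used))
```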


Modeling raises many issues. One subtlety is that the move to 14 above may be
evidence of a more elementary limitation--a failure to understand the logical implications of
the draft warning, i.e., that a pit is in a neighboring cave. The current state of the Student
Model is used by the Psychologist to determine, in the event of a nonoptimal move, which skill
is in fact missing. The Student Model indicates the level of play that can be expected from
this player--the player might be a beginner with incomplete knowledge of the basic rules of
the game, a novice with understanding of the logical skills, an amateur with knowledge of the
logical and the more elementary probability skills, etc. The Psychologist would attribute the
student's error in the current situation to unfamiliarity with a skill at his current level of play;
in this case, Timmy is a player who has mastered the logical skills and is learning the basic
probability heuristics. Hence, the Coach's explanation focused on explaining the double
evidence heuristic.

The Tutor is responsible for abridging the Coach's response to the player's move to
cave 14. (The complete explanation generated by the Expert consisted of the three lemmas shown
above.) Such pruning is imperative if the Coach is to generate comprehensible advice.
Hence, the Tutor prunes the complete analysis on the basis of simplification rules that delete
those parts of the argument that are already known to the player on the basis of the
Student Model and those portions that are too complex. Here, the Coach deleted Lemma 1,
the discussion of the Wumpus danger, because it is based on the negative evidence skill
that the Student Model attributes to the player. Lemma 3, the elimination argument for bats,
is potentially appropriate to discuss; but a simplification strategy directs the Coach to focus
on a single skill. Additional information will be given by the Coach if requested by the player.


Conclusions 

The novelty of this research is that in a single system there is significant domain
expertise, a broad range of possible interaction strategies available to the Tutor, and a
modeling capability for the student's current knowledge state. Informal experience with over
20 players of various ages has shown WUSOR-II to be a helpful learning aid, as judged by
interviews with the players. The short-term payoff from this research is an improved
understanding of the learning and teaching processes. The long-term payoff is the
development of a practical educational technology, given the expected decrease in
hardware costs.


C6. BUGGY

BUGGY is a program that can accurately determine a student's misconceptions (bugs)
about basic arithmetic skills. The system, developed by John Seely Brown, Richard Burton,
and Kathy Larkin at Bolt, Beranek and Newman, Inc., provides a mechanism for explaining why
a student is making an arithmetic mistake, as opposed to simply identifying the mistake.
Having a detailed model of a student's knowledge that indicates his misconceptions is
important for successful tutoring.

A common assumption among teachers is that students do not follow procedures very
well and that erratic behavior is the primary cause of a student's inability to perform each
step correctly. Brown & Burton (1978) argue that students are remarkably competent
procedure followers, but they often follow the wrong procedures. By presenting examples of
systematic incorrect behavior, BUGGY allows teachers to practice diagnosing the underlying
causes of a student's errors. Using BUGGY, teachers gain experience at forming hypotheses
about the relationship between the symptoms of a bug that a student manifests and the
underlying misconception. This experience helps teachers become more aware of methods or
strategies available for diagnosing their students' problems properly.


Manifesting  Bugs 

Experience with BUGGY indicates that forming a model of what is wrong with a
student's method of performing a task is often more difficult than performing the task itself.
Consider, for example, the following addition problems and their (erroneous) solutions. They
were provided by a student with a "bug" in his addition procedure:


    41     328     989      66     216
   + 9    +917    + 52    +887    + 13
   ---    ----    ----    ----    ----
    50    1345    1141    1053     229


Once you have discovered the bug, try testing your hypothesis by simulating the buggy
student--predict his results on the following two test problems:

   446     201
  +815    +399

The bug is simple. In procedural terms, after determining the carry, the student forgets
to reset the "carry register" to zero; he accumulates the amount carried across the
columns. For example, in the student's second problem (328 + 917 = 1345), he proceeds as
follows: 8 + 7 = 15, so he writes 5 and carries 1; 2 + 1 = 3 plus the 1 carried is 4; finally,
3 + 9 = 12, but the 1 carried from the first column is still there--it has not been reset--so
adding it to the final column gives 13. If this is the correct bug, then the answers to the
test problems will be 1361 and 700. (This bug is really not so unusual; a child often uses his
fingers to remember the carry and might forget to bend them back after each column.)
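The bug can be simulated directly. The Python sketch below is an illustration and not
BUGGY's own code: it assumes the student works right to left, writes the whole sum of the
leftmost column, and adds each new carry to a register that is never cleared. The function
name is hypothetical.

```python
def buggy_add(a, b):
    """Add two non-negative integers the way the buggy student does."""
    xs = [int(d) for d in reversed(str(a))]   # digits, rightmost first
    ys = [int(d) for d in reversed(str(b))]
    width = max(len(xs), len(ys))
    xs += [0] * (width - len(xs))             # blank columns count as zero
    ys += [0] * (width - len(ys))

    carry = 0          # the "carry register"
    written = []       # what the student writes, rightmost column first
    for i in range(width):
        total = xs[i] + ys[i] + carry
        if i == width - 1:
            # Leftmost column: the whole sum is written down.
            written.append(str(total))
        else:
            written.append(str(total % 10))
            # THE BUG: the new carry is added to the register instead of
            # replacing it (a correct procedure would set carry = total // 10).
            carry += total // 10
    return int("".join(reversed(written)))


print(buggy_add(328, 917))  # 1345
print(buggy_add(446, 815))  # 1361
print(buggy_add(201, 399))  # 700
```

Replacing `carry += total // 10` with `carry = total // 10` restores ordinary addition,
which is what makes this a single, localized fault in an otherwise correct procedure.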

The model built by BUGGY incorporates both correct and incorrect subprocedures that
simulate the student's behavior on particular problems and capture which parts of a student's
skill are correct and which parts are incorrect. BUGGY represents a skill, such as addition, as
a collection of subskills, one of which, for example, is knowing how to "carry" a digit into the
next column. The subprocedures in BUGGY that correspond to human subskills are linked into
a procedural net (Sacerdoti, 1974), which is BUGGY's representation of the entire human skill.
If all the subprocedures in BUGGY's procedural net for addition work correctly, then BUGGY
will do addition problems correctly. On the other hand, replacing correct subprocedures with
ones that are faulty will result in systematic errors of the kind shown above. Brown and
Burton call a procedural network with one or more faulty subprocedures a diagnostic model
because it is a way of representing systematic errors. The model has been used in two
ways. First, it can diagnose a student's errors and pinpoint the bug(s) in the student's skill.
Second, it can help to train a teacher to diagnose student errors by "playing the part" of a
student with one or more buggy subskills.

When BUGGY is to diagnose a student's errors, its task is to modify the correct
procedural network of, say, subtraction until it accounts for all of the student's answers,
both right and wrong. This modification is done by systematically replacing correct
subprocedures with incorrect variants until a consistent diagnostic model is found. There are
currently 70 primitive faulty subprocedures for subtraction. These are explored
exhaustively while attempting to determine a consistent diagnostic model. If a single variant
or bug is insufficient to explain a student's behavior, then combinations of two bugs are
tried. (Because of the overwhelming number of combinations of three or more bugs, these
are not used to form diagnostic models.) In this manner, 330 "bugs" have been identified,
each with a bug description. Interactions among bugs and the ramifications of a buggy
subprocedure's being called by several high-order procedures constitute major challenges
for designing efficient simulations of multiple bugs. Note also that this technique requires a
large amount of compute time and is amenable only to domains where bugs can be explicated
in a more or less complete way.
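The search itself (try the correct network, then every single variant, then every pair)
can be sketched on a toy scale. The Python below is an assumption-heavy illustration: it
uses addition rather than subtraction, models the procedural network as a table with two
replaceable subprocedures, and invents three faulty variants for the purpose. None of the
names or variants come from BUGGY.

```python
from itertools import combinations, product

# Correct subprocedures, one per slot of the toy "network".
CORRECT = {
    "carry": lambda old, new: new,      # replace the carry register
    "pad": lambda digits: 0,            # a blank column counts as zero
}
# Hypothetical faulty variants for each slot.
VARIANTS = {
    "carry": {"accumulate": lambda old, new: old + new,
              "drop": lambda old, new: 0},
    "pad": {"repeat_last": lambda digits: digits[-1]},
}


def add_with(procs, a, b):
    """Column addition driven by the given subprocedures."""
    xs = [int(d) for d in reversed(str(a))]
    ys = [int(d) for d in reversed(str(b))]
    width = max(len(xs), len(ys))
    xs = xs + [procs["pad"](xs)] * (width - len(xs))
    ys = ys + [procs["pad"](ys)] * (width - len(ys))
    carry, out = 0, []
    for i in range(width):
        total = xs[i] + ys[i] + carry
        if i == width - 1:
            out.append(str(total))           # leftmost column written whole
        else:
            out.append(str(total % 10))
            carry = procs["carry"](carry, total // 10)
    return int("".join(reversed(out)))


def diagnose(examples, max_bugs=2):
    """Return every smallest set of variants consistent with all examples.

    examples -- list of (a, b, student_answer) triples
    """
    slots = list(VARIANTS)
    for n in range(max_bugs + 1):            # 0 bugs, then 1, then 2
        found = []
        for chosen in combinations(slots, n):
            for names in product(*(VARIANTS[s] for s in chosen)):
                procs = dict(CORRECT)
                for slot, name in zip(chosen, names):
                    procs[slot] = VARIANTS[slot][name]
                if all(add_with(procs, a, b) == ans for a, b, ans in examples):
                    found.append(tuple(f"{s}:{v}" for s, v in zip(chosen, names)))
        if found:
            return found
    return []                                 # no consistent diagnostic model


student = [(328, 917, 1345), (989, 52, 1141), (66, 887, 1053),
           (216, 13, 229), (41, 9, 50)]
print(diagnose(student))   # [('carry:accumulate',)]
```

Even in this toy form the cost grows with the number of variant combinations, which is
why the search stops at pairs of bugs.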

The procedural network for subtraction (and its bugs) was tested in an experiment
using the results of a test administered to 1,325 students in the fourth, fifth, and sixth
grades. Using the BUGGY model, it was found that about 40% of the students demonstrated
consistently buggy behavior. That is, BUGGY was able to construct a diagnostic model for
these students' errors, demonstrating that they were using nonrandom, though incorrect, procedures.


Using BUGGY to Train Teachers

BUGGY  can  also  be  used  to  train  teachers  to  diagnose  student  errors.  This  facility  is 
illustrated  by  a  protocol  excerpted  from  Brown  (1977)  in  which  BUGGY  presents  a  team  of 
teachers  with  incorrectly  solved  math  problems  and  supervises  their  diagnosis  of  the  bug. 


Protocol  of  a  Team  Using  BUGGY 

Welcome to Bugland. I have chosen a bug.

Here is an example of the bug.

     17
    + 5
    ---
     13

The system presents an example in which the student didn't get the right answer.

Now you can give me problems to determine the bug.

**   18     43
**   +6    +79
     --    ---
     15     23

The teachers generate problems, and BUGGY solves them incorrectly, following a buggy
procedure. The team is trying to diagnose the bug in BUGGY's addition procedure by
presenting BUGGY with problems to solve. However, they are using problems similar to the
original one, which is a poor strategy because it may leave other symptoms of the bug
undiscovered.

**  GOT  THE  BUG! 

The team thinks that they know what the problem is. Although omitted here for space (it is
shown below), BUGGY asks them to formulate the bug in English. This step has the effect of
making the team verbalize the bug clearly. It is a difficult task, and the results are often
ambiguous. BUGGY does not "read" the description of the bug, because it does not have
the natural language abilities to do so. Instead, it generates a series of problems for the
team to solve. The team must solve them incorrectly, using the bug that they think they
have identified, to convince BUGGY that they have in fact discovered the bug.

Here are some problems to test your theories about the bug.

What is:

     21
    +39
**   51

That's not the bug I have. Using my bug:

     21
    +39
    ---
     15

Try giving me some more examples...

The team fails BUGGY's test by being unable to solve the given problem. That is, their bug
is not the same one that BUGGY is using. If they had solved the problems "correctly" (that
is, incorrectly because of a bug), then BUGGY would credit them with a correct diagnosis.
Now, however, they must set more problems for BUGGY to solve, in an attempt to deduce
the bug.

**    51     99     68
** +1707    +99     +9
   -----    ---    ---
      21     36     23

** GOT THE BUG!

Please  describe  the  bug  you  think  I  have: 

**  THE  STUDENT  IS  ADDING  ALL  THE  DIGITS  TOGETHER. 

Here are some problems to test your theory about the bug.
What are:

     33   1091      8     28     90
    +99    +60    +34    +70     +6
    ---   ----    ---    ---    ---
**   24     17     15     17     15

Very good. My description of the bug is:

The student always sums up all the digits, with no regard to columns.
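BUGGY's test of the team amounts to comparing the team's answers with those produced by
the hidden bug. A minimal Python sketch of that check follows; the function names are
hypothetical, and the digit-summing procedure is written from BUGGY's own description of
the bug rather than from its code.

```python
def sum_all_digits(a, b):
    """The hidden bug: add up every digit, with no regard to columns."""
    return sum(int(d) for d in f"{a}{b}")


def team_passes(problems, team_answers, bug):
    """True if the team reproduced the bug's answer on every problem."""
    return all(bug(a, b) == answer
               for (a, b), answer in zip(problems, team_answers))


print(sum_all_digits(43, 79))   # 23
print(sum_all_digits(99, 99))   # 36
print(team_passes([(33, 99), (1091, 60), (28, 70)], [24, 17, 17],
                  sum_all_digits))   # True
```

Checking answers rather than parsing the team's English description is what lets BUGGY
verify a diagnosis without natural language understanding.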


Summary 

The central idea of this research is that procedural networks can be used to build
diagnostic models. This representation scheme makes it possible to decompose a skill
appropriately into subskills, to make explicit the control structures underlying a collection of
skills, and to execute the resulting diagnostic model directly. Rather than being a subset or
simplification of expert rules, as in overlay modeling (Carr & Goldstein, 1977; see Articles B,
C4, and C5), the procedural network is a model of a student's behavior that is structured in
terms of semantically meaningful deviations from the correct procedure. Each subskill, whether
correct or incorrect, is explicitly represented as a subprocedure in the network.


References 

Brown & Burton (1978) is the most recent and complete report on BUGGY. Also see
Brown, Burton, Hausmann, Goldstein, Huggins, & Miller (1977) and Brown, Burton, & Larkin
(1977).


C7. EXCHECK

EXCHECK is an intelligent computer-aided instruction system designed and implemented
by Patrick Suppes and his colleagues at the Institute for Mathematical Studies in the Social
Sciences (IMSSS) at Stanford University. It is a general-purpose instructional system used
principally to present complete, university-level courses in logic, set theory, and proof
theory. In the courses taught using the EXCHECK system, lesson material is presented to
the student at his computer terminal, followed by exercises consisting of theorems that he is
to prove using the program's theorem prover. The courses are taught on IMSSS's CAI
system, which uses computer-generated speech and split-screen displays. Several hundred
Stanford students take these courses each year.

From an AI point of view, the most interesting aspects of the EXCHECK system are the
procedures and the underlying theories of mathematical reasoning that permit this interaction
to take place in a natural style closely approximating standard mathematical practice. These
include natural language facilities, natural-deduction-based proof procedures, theorem
provers, decision procedures for some simple mathematical theories, procedures for analyzing
and summarizing proofs, and procedures for conducting dialogues about some elementary
mathematical structures.

Examples of the kind of natural language accepted and generated are given in the
proofs and dialogues presented below. The basic logic is a variant of Suppes's (1957)
formulation of natural deduction augmented by high-level inference procedures that are the
analogs of proof procedures used in standard mathematical practice.


Understanding  Informal  Mathematical  Reasoning 

The mathematical reasoning involved in the set theory and proof theory courses is
complex and subtle. The fundamental AI problem of EXCHECK is making the program capable
of understanding informal mathematical reasoning: The program must be able to follow
mathematical proofs presented in a "natural" manner. That is, just as the intent of natural
language processing is to handle languages that are actually spoken, the intent of natural
proof processing is to handle proofs as they are actually done by practicing mathematicians.
In general, such proofs are presented by giving a sketch of the main line of argument along
with any other mathematically significant information that might be needed to completely
reconstruct the proof. This style should be contrasted with the derivations familiar from
elementary logic, where each detail is presented and the focus of attention is on syntactic
manipulations rather than on the underlying semantics.

A major aspect of the problem of machine understanding of natural proofs is finding
languages that permit users to express their proofs in the fashion described above. Such
languages, in turn, must find their basis in an analysis or model of informal mathematical
reasoning. Finding these natural proof languages should be compared to the problem of
finding high-level "natural" or "English-like" programming languages. For more detailed
discussions of these issues, see Blaine & Smith (1977), Smith (1976), and Smith et al.
(1976). A simple example of understanding informal mathematical reasoning and a fuller
discussion of the techniques involved follow.
Student Proof

We present two proofs of the elementary theorem,

Thm: If A ⊂ B then ¬(B ⊆ A)

where "⊂" is used for proper subset and "⊆" is used for subset.

First, the proof is given in the informal style of standard mathematical practice.

We want to show that if A ⊂ B, then ¬(B ⊆ A).

Assume A ⊂ B. We show ¬(B ⊆ A) by indirect proof. Assume that B ⊆ A. Since A ⊂
B then, by definition, A ⊆ B and A ≠ B. Since A ⊆ B and B ⊆ A then A = B. But this is
a contradiction and, hence, the assumption that B ⊆ A is false. Therefore, ¬(B ⊆ A).

The following typescript shows how one student did the proof in the EXCHECK system.
Input from the student is in boldface type and comments are in italics; everything else is
generated by the program. The program keeps track of the goal that the student is currently
trying to establish; the initial goal is the theorem to be proven. EXCHECK indicates
acceptance of an inference by returning the top-level prompt **; if a suggested inference is
not acceptable, EXCHECK returns an error message.

Derive: If A ⊂ B then ¬(B ⊆ A)

** hyp (1) ** A ⊂ B

The hypothesis of the theorem is assumed. The goal is automatically reset to the
consequent of the theorem.


** raa
assume (2) ** B ⊆ A

The student begins an indirect proof (the command 'raa' is a mnemonic for reductio ad
absurdum). The program assumes the negation of the current goal. The goal is now any
contradiction.


** 1 definition Number or Name? ** proper subset
1 Df. proper subset
(3) A ⊆ B and A ≠ B

The definition of proper subset is applied to line 1.


** 2,3 establish ** B = A
2,3 Establish
(4) B = A

The student asks the theorem prover to check the simple set-theoretic inference.


** 3,4 contradiction
Using ** ok
3,4 Contradiction
(5) ¬(B ⊆ A)

The student indicates that lines 3 and 4 lead to a
contradiction. EXCHECK returns the negation of assumption (2).


** qed
Correct

EXCHECK accepts the derivation.

The following informal review printout was generated by the program from the proof given in
the above typescript.

Derive: If A ⊂ B then ¬(B ⊆ A)

Assume (1) A ⊂ B
By raa show: ¬(B ⊆ A)

Assume (2) B ⊆ A

From 1, by definition of proper subset,

(3) A ⊆ B and A ≠ B
From 2,3 it follows that,

(4) A = B

3,4 lead to a contradiction; hence, assumption 2 is false:

(5) ¬(B ⊆ A)
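
The goal bookkeeping described in the typescript comments can be illustrated with a
minimal sketch in Python. Everything in it is an assumption made for illustration: the
names ProofState, negate, and accept, and the encoding of formulas as strings and tuples,
are hypothetical and are not taken from EXCHECK. The sketch only tracks line numbers and
the current goal; the checks that EXCHECK delegates to its inference procedures and
theorem prover are taken on trust here.

```python
from dataclasses import dataclass, field

CONTRADICTION = "any contradiction"  # goal used while an indirect proof is open


def negate(formula):
    """Negate a formula, cancelling a leading negation instead of doubling it."""
    if isinstance(formula, tuple) and formula[0] == "not":
        return formula[1]
    return ("not", formula)


@dataclass
class ProofState:
    goal: object                                   # formula currently to be established
    lines: list = field(default_factory=list)      # accepted lines; line n is lines[n - 1]
    open_raa: list = field(default_factory=list)   # (assumption line, goal before raa)

    def _add(self, formula):
        self.lines.append(formula)
        return len(self.lines)

    def hyp(self):
        """Assume the antecedent of a conditional goal; the goal becomes the consequent."""
        if not (isinstance(self.goal, tuple) and self.goal[0] == "if"):
            raise ValueError("hyp needs a conditional goal")
        _, antecedent, consequent = self.goal
        self.goal = consequent
        return self._add(antecedent)

    def raa(self):
        """Start an indirect proof: assume the negation of the current goal."""
        n = self._add(negate(self.goal))
        self.open_raa.append((n, self.goal))
        self.goal = CONTRADICTION
        return n

    def accept(self, formula):
        """Record a line whose justification is taken on trust in this sketch."""
        return self._add(formula)

    def contradiction(self):
        """Close the innermost indirect proof (the contradiction is not checked here)."""
        n, old_goal = self.open_raa.pop()
        self.goal = old_goal
        return self._add(negate(self.lines[n - 1]))

    def qed(self):
        """The derivation is complete when no raa is open and the goal is a line."""
        return not self.open_raa and self.goal in self.lines


state = ProofState(goal=("if", "A proper-subset B", ("not", "B subset A")))
state.hyp()                             # (1) A proper-subset B
state.raa()                             # (2) B subset A
state.accept("A subset B and A != B")   # (3) by definition of proper subset
state.accept("B = A")                   # (4) by establish
state.contradiction()                   # (5) not (B subset A)
print(state.qed())
```

The five calls mirror lines (1) through (5) of the typescript; a fuller model would verify
each step rather than simply recording it.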


Natural  Inference  Procedures 

There are no significant structural differences between the detailed informal proof and
the student's proof as presented to EXCHECK. The same steps occur in the same relations
to each other. Such global or structural fidelity to natural proofs is a major research goal of
the EXCHECK project and depends upon the development of natural inference procedures.
Some of these, such as the HYPOTHESIS and INDIRECT PROOF procedures used in the above
proof, are familiar from standard logical systems. The procedure used in the application of
the definition of proper subset to line (1) is called IMPLIES. It is used to derive results that,
intuitively speaking, follow by applying a previous result or definition. It is considerably more
complex than the inference procedures usually found in standard logical systems. An even
more complex natural inference procedure used in the above proof is the ESTABLISH
procedure. In general, ESTABLISH is used to derive results that are consequences of prior
results in the theory under consideration, in this case in the theory of sets. Eliminating the
need to cite specific results in the theory, which would disrupt the main line of argument, is
important and is discussed further in the section on ESTABLISH, below.


The inference procedures in EXCHECK are intended not only to match natural
inferences in strength but also to match them in degree and kind. However, there are
differences. EXCHECK inference procedures must always be invoked explicitly, whereas in
standard practice particular inference procedures or rules are usually not cited explicitly. For
example, compare how the student expresses the inferences that result in lines (3) and (4)
with their counterparts in the informal proof. The explicit invocation of inference procedures
basically requires that two pieces of information be given: first, the inference procedure to
be used; and, second, the previous results to be used. In particular, explicit line numbers
must be used.

Explicitness in itself is not disruptive of mathematical reasoning. Neither the reduction of
complex inferences to smaller inferences nor the use of explicit line numbers is disruptive, in
the sense of distracting the student from the main line of the mathematical argument; both
are simple elaborations of the main structure. However, having to think about what
inference rule to use can interrupt the main line of argument. The success of a system for
interactively doing mathematics depends crucially upon having a few powerful and natural
inference procedures, with clear criteria of use, that are sufficient to handle all the
inferences.


IMPLIES 

IMPLIES is used to derive results by applying a previous result or definition as a rule of
inference in a given context. This form of inference is probably the most frequent naturally
occurring inference. While the basic pattern is simple, the refinements that must be added to
the basic form to get a procedure that handles most of the naturally occurring cases result in
a computationally complex procedure. The following is a simple example of the basic pattern:

(i) A is a subset of B


** i definition (Name or number) ** subset

(j) (∀x)(x ∈ A → x ∈ B)

In this example, the student directed the program to apply the definition of subset to line (i)
and IMPLIES generated the result: (∀x)(x ∈ A → x ∈ B). While the student thinks he is
applying the definition of subset to line (i), the procedure actually invoked is the IMPLIES
procedure. It is important to note that in a use of the IMPLIES procedure, the student
indicates what axiom, definition, theorem, or line to apply to which lines, and the IMPLIES
procedure generates the formula that is the result of the inference.
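
The division of labor just described, in which the student names the definition and the
line while the procedure generates the resulting formula, can be sketched in a few lines
of Python. This is only a toy under stated assumptions: the DEFINITIONS table, the
function name implies, and the use of regular-expression templates are hypothetical, and
the actual IMPLIES procedure is, as noted above, computationally far more complex.

```python
import re

# Hypothetical definition table: each entry pairs a pattern for the cited line
# with a template for the formula that the definition yields.
DEFINITIONS = {
    "subset": (
        r"(\w+) is a subset of (\w+)",
        r"(forall x)(x in \1 -> x in \2)",
    ),
    "proper subset": (
        r"(\w+) is a proper subset of (\w+)",
        r"\1 is a subset of \2 and \1 != \2",
    ),
}


def implies(line, definition_name):
    """Apply the named definition to a line and return the generated formula."""
    pattern, template = DEFINITIONS[definition_name]
    match = re.fullmatch(pattern, line)
    if match is None:
        raise ValueError(f"definition of {definition_name} does not apply to: {line}")
    # Substitute the matched terms into the template.
    return match.expand(template)


print(implies("A is a subset of B", "subset"))
```

The student supplies only the two arguments; the formula itself is produced by the
procedure, which is the point of the example in the text.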

The IMPLIES procedure seems to correspond closely to naive notions of inference, in
that logically unsophisticated but mathematically sophisticated users can use it very well
after seeing the basic explanation and a few simple examples. However, the IMPLIES rule
does have a fault: It is a purely logical inference procedure, and that can occasionally cause
problems for users, because mathematicians tend to think in terms of set-theoretic rather
than logical consequence. (See the discussion of the ESTABLISH rule for more on this
distinction.)
ESTABLISH


The following example of a simple use of ESTABLISH is taken from the typescript above.

(2) B ⊆ A

(3) A ⊆ B and A ≠ B

** 2,3 establish ** B = A
2,3 Establish

(4) B = A

The ESTABLISH rule allows users to simply assert that some formula is an elementary
set-theoretic truth or is an elementary set-theoretic consequence of prior results. In the above
example, ESTABLISH is used to infer from A ⊆ B and B ⊆ A that A = B. A = B is a set-theoretic
consequence but not a logical consequence of A ⊆ B and B ⊆ A. If ESTABLISH handled only
logical consequence, the student would have had to explicitly cite the relevant set-theoretic
theorems or definitions needed to reduce the inference to a purely logical inference. This is
not only disruptive of the line of argument but also difficult to do. Even the most
experienced logicians and mathematicians have difficulty ferreting out all the axioms,
definitions, and theorems needed to reduce even simple inferences to purely logical
inferences.

All of the examples so far are extremely simple if considered in terms of the full
capabilities of the ESTABLISH procedure. ESTABLISH uses a theorem prover that can prove
about 86% of the first 200 theorems in the set theory course.
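
To make the notion of set-theoretic consequence concrete, the following Python sketch
searches small finite universes for a counterexample to a claimed inference such as the
one from A ⊆ B and B ⊆ A to A = B. This is a deliberately swapped-in technique, finite
counterexample search, and not the theorem prover that ESTABLISH uses: it can refute a
claimed consequence, but a search that finds no counterexample is not a proof. The
function names and the three-element universe are assumptions made for illustration.

```python
from itertools import combinations, product


def subsets(universe):
    """Return every subset of the universe as a frozenset."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def no_small_counterexample(premises, conclusion, universe=(0, 1, 2)):
    """Return True if no choice of A and B from the universe satisfies every
    premise while falsifying the conclusion."""
    for a, b in product(subsets(universe), repeat=2):
        if all(p(a, b) for p in premises) and not conclusion(a, b):
            return False
    return True


# The inference of line (4): from B subset A and A subset B, conclude B = A.
# On frozensets, <= is the subset test and < is the proper-subset test.
print(no_small_counterexample(
    [lambda a, b: b <= a, lambda a, b: a <= b],
    lambda a, b: b == a,
))

# A claimed inference that fails: A subset B alone does not give A = B.
print(no_small_counterexample([lambda a, b: a <= b], lambda a, b: a == b))
```

The premises here rely on the meaning of ⊆ and = for sets (Python's built-in set
comparisons), which is exactly the knowledge a purely logical procedure would need to be
given as explicitly cited definitions.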


Proof  Analysis  and  Summarization 

EXCHECK contains procedures that generate informal summaries and sketches of
proofs. Such analyses and summaries are useful not only as a semantic basis for the program
to better understand proofs and to better present proofs, but also to give guidance to the
student (see the proof summary below for an example of the kind of guidance that can be
generated). The summarization procedures analyze the proof by breaking it into parts (or
"subproofs") and isolating the mathematically important steps. They also permit a
goal-oriented interpretation of the proof, in which the program keeps track of what is to be
established at that point (i.e., the current goal); which lines, terms, etc., are relevant; and
how the current line or part fits into the whole structure. MYCIN's consultation explanation
system (see article C1) uses a similar approach. Goldstein (1977) also uses summarization
techniques in the rhetorical modules of the WUMPUS coach (article C5).

The summaries presented below were generated by EXCHECK from a student proof of
the Hausdorff maximal principle. The original line numbers have been retained (in
parentheses) in order to give a sense of how much of the proof has been omitted in the
summary. In the first summary only the top-level part of the proof is presented; the proofs
of its subparts are omitted. Also, all mathematically or logically insignificant information is
omitted. In these proofs and summaries, "D contains E" is synonymous with E ⊆ D. Also, C
is a chain iff both C is a set of sets and, given any two elements of C, at least one is a
subset of the other.
Derive: If A is a family of sets then
every chain contained in A is contained in some maximal chain in A

Proof:

Assume (1) A is a family of sets
Assume (2) C is a chain and C ⊆ A
Abbreviate: {B: B is a chain and C ⊆ B and B ⊆ A}
by: Clchains
By Zorn's lemma,

(23) Clchains has a maximal element
Let B be such that

(24) B is a maximal element of Clchains
Hence,

(26) B is a chain and C ⊆ B and B ⊆ A
It follows that,

(31) B is a maximal chain in A
Therefore,

(32) C is contained in some maximal chain in A

Figure 1. Informal summary of a proof of the Hausdorff
maximal principle.

The summary above is not the only one that could be generated; it essentially presents only
the main part of the proof. Subparts of the main part could have been included or even
handled independently if so desired.

The proof analysis and summarization procedures will also generate the following kind
of summary, which is an attempt to sketch the basic idea of the proof.

Derive: If A is a family of sets then
every chain contained in A is contained in some maximal chain in A

Proof:

Use Zorn's lemma to show that

{B: B is a chain and C ⊆ B and B ⊆ A}

contains a maximal element B. Then show that B is a maximal chain in
A which contains C.

Figure 2. An example summarization.


The summarization in Figure 2 was obtained from that in Figure 1 by tracing backwards
the history of the maximal chain in A that contains C. That is, the general form of the
theorem to be proven is (∃x)FM(x), which is proven by showing FM(t) for some term t.
Usually, in proofs of this form, the most important piece of information is the term t. Tracing
backwards in this particular proof yields that there are two terms involved. The first is the
set of all chains in A containing C, and the second is any maximal element of the set of all
chains in A containing C.
Elementary Exercises and Dialogues

Another form of reasoning done by students is the solution of problems. A great many
problems in elementary mathematics take the form of asking the student to give finite
objects satisfying certain conditions. For example, given the finite sets A and B, the student
might be asked to give a function F that is a bijection (i.e., 1-1 and onto) from A to B. For a
large class of such problems there are programs that will generate a tree of formulas and
other information from the original statement of the problem. We call such trees verification
trees for the problem. Essentially, the verification tree for a problem constitutes a reduction
of the original (usually not directly verifiable) condition to a collection of directly verifiable
conditions (the formulas at the leaves). These trees have the property that the failure of
the formula at a node in the tree explains the failure of formulas at any of its ancestors.
Similarly, the failure of a formula at a node is explained by the failure of formulas at any of
its descendants.

For example, in the above problem of supplying a bijection F from A onto B, suppose
that the student forgets to specify a value for some element of A, say, 3. The first response
to the student might be: "The domain of F isn't A." The student might then ask: "Why?" The
program would then answer (going towards the leaves), "Because there is an element of A
that has not been assigned a value in B." The student might then ask, "Which one?" Since
the routines that evaluate the formulas at the leaves provide counterexamples if those
formulas fail, the program could then respond, "3." Or, going back to the first response by the
program ("The domain of F isn't A"), the student might say, "So?" The program could then
move a step towards the root (the original statement of the conditions) and say, "Then F is
not a map from A into B." The student might then again say, "So?", to which the program
could respond, "F is not a bijection from A onto B."

The highly structured information in the verification tree provides the semantic base for
a dialogue with the student in which the program can explain to the student what is wrong
with the answer. It should be noted that more complex forms of explanation are available.
In particular, the program could have said at the beginning that, "Because 3 is not given a
value by F, the domain of F is not A and hence F is not a bijection from A onto B."
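
The dialogue above can be reproduced with a small Python sketch of a verification tree.
The sketch is an illustration under stated assumptions, not EXCHECK's implementation: the
names Node, why, so, and build_tree are hypothetical, F is represented as a dictionary, and
only the branch of the tree that appears in the dialogue is built (the branches for the
1-1 and onto conditions are left out).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)
class Node:
    failure_message: str
    check: Optional[Callable] = None     # leaves only: returns a counterexample or None
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def counterexample(self, answer):
        """Return a counterexample from a failing leaf below this node, else None."""
        if self.check is not None:
            return self.check(answer)
        for child in self.children:
            found = child.counterexample(answer)
            if found is not None:
                return found
        return None

    def fails(self, answer):
        return self.counterexample(answer) is not None


def why(node, answer):
    """Step toward the leaves: a failing child explains the failure of this node."""
    return next((c for c in node.children if c.fails(answer)), None)


def so(node):
    """Step toward the root: the failure of this node explains that of its parent."""
    return node.parent


def build_tree(a):
    """Build the partial tree for 'F is a bijection from A onto B' used in the dialogue."""
    root = Node("F is not a bijection from A onto B.")
    is_map = root.add(Node("Then F is not a map from A into B."))
    domain = is_map.add(Node("The domain of F isn't A."))
    domain.add(Node(
        "Because there is an element of A that has not been assigned a value in B.",
        check=lambda f: next((x for x in sorted(a) if x not in f), None),
    ))
    return root, domain


root, domain = build_tree({1, 2, 3})
f = {1: "a", 2: "b"}                     # the student forgot to give a value for 3
leaf = why(domain, f)
print(domain.failure_message)            # first response
print(leaf.failure_message)              # reply to "Why?"
print(leaf.counterexample(f))            # reply to "Which one?"
print(so(domain).failure_message)        # reply to "So?"
print(so(so(domain)).failure_message)    # reply to a second "So?"
```

Because every failure in the tree is traced to a leaf that returns a concrete
counterexample, the combined explanation quoted above could be assembled by walking from
that leaf to the root.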


Summary 

A primary activity in mathematics is finding and presenting proofs. In the EXCHECK
system an attempt is made to handle natural proofs (proofs as they are actually done by
practicing mathematicians) instead of requiring that these proofs be expressed as
derivations in an elementary system of first-order logic. This objective requires the analysis
of inferences actually made and the design and implementation of languages and procedures
that permit such inferences to be easily stated and mechanically verified. Some progress has
been made in handling natural proofs in elementary mathematics, but there is a considerable
amount of work yet to be done.